Fixed bug: repeated meeting ID crashed script
author Rhett Aultman <rhett.aultman@samsara.com>
Fri, 7 Jan 2022 18:27:20 +0000 (13:27 -0500)
committer Rhett Aultman <rhett.aultman@samsara.com>
Fri, 7 Jan 2022 18:27:20 +0000 (13:27 -0500)
The script could not handle the same meeting ID being encountered more
than once.  Each meeting ID gets its own directory under the downloads
directory, and os.mkdir raises FileExistsError when the directory
already exists.
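
For illustration only, here is a minimal sketch of another way to make directory creation safe to repeat. It is not the code in this commit, which checks os.path.isdir before calling os.mkdir. The helper name ensure_meeting_dir is hypothetical; os.makedirs with exist_ok=True is standard library.

```python
import os


def ensure_meeting_dir(base_path, meeting_id):
    """Create the per-meeting directory if needed and return its path.

    ensure_meeting_dir is a hypothetical helper, not part of cloudlink.py.
    """
    meeting_dir = os.path.join(base_path, str(meeting_id))
    # exist_ok=True turns a repeated meeting ID into a no-op instead of
    # raising FileExistsError the way a bare os.mkdir would.
    os.makedirs(meeting_dir, exist_ok=True)
    return meeting_dir
```

Calling the helper twice with the same meeting ID returns the same path both times and raises nothing.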

While I was at it, I also added a check for existing files so that
restarting the downloader does not start over from the beginning.
This makes restarts easier.  The trade-off is that a file left partially
downloaded by an interrupted run will be skipped on later runs, but that
is a better place to be than restarting a download of hundreds of videos
from scratch.
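
One possible way to close the partial-download gap, sketched under assumptions and not implemented in this commit: write to a temporary ".part" name and rename it only after the last chunk is written. The function name save_chunks_atomically and the ".part" suffix are hypothetical; in cloudlink.py the chunks would presumably come from response.iter_content(chunk_size=8192).

```python
import os


def save_chunks_atomically(chunks, local_filename):
    """Write chunks to local_filename; return False if it already exists.

    Hypothetical helper: data goes to local_filename + '.part' first, so an
    interrupted run never leaves a file under the final name.
    """
    if os.path.isfile(local_filename):
        # A file under the final name is only ever produced by a finished write.
        return False

    part_filename = local_filename + '.part'
    # 'wb' truncates any stale .part file left behind by an earlier interrupted run.
    with open(part_filename, 'wb') as f:
        for chunk in chunks:
            f.write(chunk)

    # os.replace renames the file into place, overwriting any existing target.
    os.replace(part_filename, local_filename)
    return True
```

With this shape, the "file exists, skipping" check only ever sees completed downloads, and a restart simply redoes whichever file was in flight.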

README.md
cloudlink.py

index 7078b595f82696e8b887dc2f1cc0228dd9743c2f..58f41aea5914e6f56da10778b5aa2e2f3e885b75 100644 (file)
--- a/README.md
+++ b/README.md
@@ -15,6 +15,11 @@ If you haven't done that before you will need to use PIP (https://pip.readthedoc
 
 Exact command is "pip install -r requirements.txt"
 
+Next, edit cloudlink.py to do the following:
+* Set your JWT and USER ID
+* Set the destination path for the downloads
+* Set the desired range of years to check
+
 Then run "python cloudlink.py"
 
 That will download all the recordings wherever you specify. 
index c8d485f940aeef0e1fb0733baac4f6d0ea5ce315..c41b5a7d0ebedb58345da63531012dc2fd3b5d75 100644 (file)
--- a/cloudlink.py
+++ b/cloudlink.py
@@ -17,12 +17,12 @@ headers = {
        }
 
 # Put your own download path here, I used an external hard drive so mine will differ from yours
-PATH = '/Volumes/Ext3/Zoom/'
+PATH = '/media/rhett/Seagate Portable Drive/VAM_2021/'
 
 
 
 def main():
-       for year in range(2018,2022):
+       for year in range(2021,2022):
                for month in range(1,13):
                        next_month = month + 1
                        next_year = year
@@ -61,12 +61,13 @@ def get_recording(start_date, next_date):
        # print(data['to'])
 
        for meeting in data['meetings']:
-               os.mkdir('{}{}'.format(PATH, meeting['id']))
+               if not os.path.isdir('{}{}'.format(PATH, meeting['id'])):
+                       os.mkdir('{}{}'.format(PATH, meeting['id']))
                for record in meeting['recording_files']:
                        if 'status' in record and record['status'] != 'completed':
                                continue
 
                        download_recording(
                                record['download_url'], 
                                record['recording_start'].replace(':','-'),
                                record['file_type'],
@@ -88,13 +89,15 @@ def download_recording(download_url, filename, filetype, meeting_id):
                suffix = 'JSON'
        suffix = suffix.lower()
        local_filename = '{}/{}/{}_{}.{}'.format(PATH, meeting_id, filename, filetype, suffix)
+       if os.path.isfile(local_filename):
+               print ('file {} exists, skipping download!'.format(local_filename))
+       else:
+               with open(local_filename, 'wb') as f:
+                       for chunk in response.iter_content(chunk_size=8192):
+                               print (len(chunk))
+                               f.write(chunk)
 
-       with open(local_filename, 'wb') as f:
-               for chunk in response.iter_content(chunk_size=8192):
-                       print (len(chunk))
-                       f.write(chunk)
 
-          
 if __name__ == '__main__':
        main()