Downloading media from Avalon¶
Use this guide if you need to download video/audio files or pull down metadata stored on Avalon.
First method¶
This method requires that you have an API key and access to the access.library.tamu.edu server. It is faster than the second method.
Use this code
Install ffmpeg, paramiko, and python. (Instructions for download are in the code.)
Enter AVALON_API_KEY (your API key) and AVALON_HOST_URL. If pulling from Avalon Prod, it should look like
AVALON_HOST_URL = r'https://avalon.library.tamu.edu'
Enter SSH_HOSTNAME and SSH_PORT:
SSH_HOSTNAME = 'access.library.tamu.edu' and SSH_PORT = 22.
Enter SSH_USERNAME and SSH_PASSWORD.
Enter AVALON_MOUNT_PATH:
AVALON_MOUNT_PATH = '/mnt/avalon_prod'. This will be different if you are pulling from Avalon Pre.
Go to the collection in your browser and copy the string that follows the AVALON_HOST_URL in the address.
In a terminal, run the program followed by the string you copied from the collection's address. For example:
python fetch-avalon-assets-by-collection 123456abcdef
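If you would rather not copy the string by hand, it can be pulled out of the collection's address with a few lines of Python. This is only a minimal sketch: it assumes the collection ID is the last path segment of the page address (for example, an address ending in /collections/123456abcdef), and the helper name extract_collection_id is hypothetical, not part of the script linked above.

```python
from urllib.parse import urlparse


def extract_collection_id(collection_url: str) -> str:
    """Return the last non-empty path segment of a collection URL.

    Assumption: the collection ID is the final path segment of the address.
    """
    path = urlparse(collection_url).path
    segments = [segment for segment in path.split("/") if segment]
    if not segments:
        raise ValueError(f"No collection ID found in {collection_url!r}")
    return segments[-1]


if __name__ == "__main__":
    # Example address; the /collections/ path shape is an assumption.
    url = "https://avalon.library.tamu.edu/collections/123456abcdef"
    print(extract_collection_id(url))
```

Check the printed value against the address bar before passing it to the download program, since the assumed URL shape may not match every page.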
Second method¶
This method requires that you have an API key.
Download pyavalon.
In the terminal, type:
pyavalon get_file_ids_from_a_collection -c "{collection id}" -i {pre or prod} --download -f {file output directory} -o {output.csv}
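The placeholders in the command above can also be filled in from Python, which helps when you need to download several collections. The sketch below only builds and prints the command; it uses just the flags shown on this page and assumes the command starts with pyavalon, as in the large-collection example further down. The function name build_download_command and the example values are hypothetical.

```python
import shlex


def build_download_command(collection_id, instance, file_dir, output_csv):
    """Build the pyavalon argument list for downloading one collection."""
    # Assumption: the instance is passed as lowercase "pre" or "prod".
    if instance not in ("pre", "prod"):
        raise ValueError("instance must be 'pre' or 'prod'")
    return [
        "pyavalon",
        "get_file_ids_from_a_collection",
        "-c", collection_id,
        "-i", instance,
        "--download",
        "-f", file_dir,
        "-o", output_csv,
    ]


if __name__ == "__main__":
    # Placeholder values; replace them with your own collection and paths.
    command = build_download_command("123456abcdef", "prod", "downloads", "output.csv")
    # shlex.join renders the list as a single shell-safe command line.
    print(shlex.join(command))
```

Passing the returned list to subprocess.run would execute the download; printing it first makes it easy to check the values before anything is fetched.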
Downloading Large Collections¶
Some collections are so large that any attempt to download them in one go results in an error. To work around this, download only part of the collection at a time.
In the terminal, type:
pyavalon get_file_ids_from_a_collection -c "{collection id}" -i prod --download -u {Your_Username} -f output-folder --get_range --start 1 --end 1
To avoid crashes, download one page at a time by setting the start and end of the range to the same page number.
Downloading Large Collections With Fewer Steps¶
To download a collection without running pyavalon by hand dozens of times, you can use a script that calls pyavalon once for each page.
Use the following code:
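from time import sleep
import subprocess
import argparse

# Command-line options; each page of the collection is downloaded separately.
parser = argparse.ArgumentParser(prog="Batch Avalon")
parser.add_argument("--collection", "-c", required=True, help="Specify Collection")
parser.add_argument("--instance", "-i", required=True, help="Pre or Prod")
parser.add_argument("--username", "-u", required=True, help="your username")
parser.add_argument("--file", "-f", required=True, help="folder where audio/video files will be saved")
parser.add_argument("--start", "-s", type=int, required=True, help="first page to download")
parser.add_argument("--end", "-e", type=int, required=True, help="last page to download")
# Accepted but not used below: each page writes its own temp_<page>.csv.
parser.add_argument("--output", "-o", help="output csv filename")
args = parser.parse_args()

current = args.start
end = args.end
while current <= end:
    print(f"Downloading Page {current}")
    # Start and end are both set to the current page, so one page is fetched per run.
    command = [
        "pyavalon",
        "get_file_ids_from_a_collection",
        "-c",
        args.collection,
        "-i",
        args.instance,
        "--download",
        "-u",
        args.username,
        "-f",
        args.file,
        "--get_range",
        "--start",
        str(current),
        "--end",
        str(current),
        "-o",
        f"temp_{current}.csv"
    ]
    # Run pyavalon for this page; report a failure but keep going.
    result = subprocess.run(command)
    if result.returncode != 0:
        print(f"Page {current} exited with code {result.returncode}")
    current += 1
    # Pause before requesting the next page.
    print("Taking a quick nap!")
    sleep(10)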
This downloads the entire collection, one page at a time, to the folder you specified. A separate CSV file named temp_{page number}.csv is also created for each page that has been downloaded.
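Once every page has been downloaded, the per-page CSV files can be combined into a single file. The following is a minimal sketch, not part of pyavalon. It assumes the files are named temp_{page number}.csv as in the script above, that they all sit in one folder, and that each starts with the same header row. The function names and the combined.csv output name are hypothetical.

```python
import csv
import re
from pathlib import Path


def page_number(path: Path) -> int:
    """Return the page number from a temp_<n>.csv filename, or -1 if it does not match."""
    match = re.fullmatch(r"temp_(\d+)\.csv", path.name)
    return int(match.group(1)) if match else -1


def merge_page_csvs(folder: str, combined_name: str = "combined.csv") -> int:
    """Merge temp_<n>.csv files in page order; return the number of data rows written."""
    folder_path = Path(folder)
    # Sort numerically so that temp_10.csv comes after temp_2.csv.
    pages = sorted(
        (p for p in folder_path.glob("temp_*.csv") if page_number(p) >= 0),
        key=page_number,
    )
    rows_written = 0
    header_written = False
    with open(folder_path / combined_name, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        for page in pages:
            with open(page, newline="", encoding="utf-8") as src:
                reader = csv.reader(src)
                header = next(reader, None)
                if header is None:
                    continue  # skip empty files
                # Assumption: every page file has the same header; keep the first one only.
                if not header_written:
                    writer.writerow(header)
                    header_written = True
                for row in reader:
                    writer.writerow(row)
                    rows_written += 1
    return rows_written


if __name__ == "__main__":
    # Merge the page files found in the current working directory.
    print(f"Merged {merge_page_csvs('.')} rows")
```

Sorting by the number in the filename rather than alphabetically keeps the rows in page order, which matters once a collection runs past nine pages.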