scitacean.File
- class scitacean.File(local_path, remote_path, remote_gid, remote_perm, remote_uid, checksum_algorithm=None, _remote_size=None, _remote_creation_time=None, _remote_checksum=None, _checksum_cache=None)
- Store local and remote paths and metadata for a file. - There are two central properties:
- remote_path: Path to the remote file relative to the dataset’s source_folder. This is always set, even if the file does not exist on the remote filesystem.
- local_path: Path to the file on the local filesystem. Is None if the file does not exist locally.
 - Files can be in one of three states, and the state can change as shown below. The state can be queried using File.is_on_local() and File.is_on_remote().
 
    local                                  remote
      │                                      │
      │ uploaded                  downloaded │
      │                                      │
      └───────────> local+remote <───────────┘
 
- Constructors
- from_local(path, *[, remote_path, ...]) – Construct a File object for a file on the local filesystem.
- from_remote(remote_path, size, creation_time) – Construct a new file object for a remote file.
- from_download_model(model, *[, ...]) – Construct a new file object from a SciCat download model.
 
- Methods
- checksum() – Return the checksum of the file.
- downloaded(*, local_path) – Return new file metadata after a download.
- local_is_up_to_date() – Check if the file on local is up-to-date.
- make_model(*[, for_archive]) – Build a pydantic model for this file.
- remote_access_path(source_folder) – Full path to the file on the remote if it exists.
- uploaded(*[, remote_path, remote_uid, ...]) – Return new file metadata after an upload.
- validate_after_download() – Check that the file on disk matches the metadata.
 
- Attributes
- checksum_algorithm – Algorithm to use for checksums.
- creation_time – The logical creation time of the SciCat file.
- is_on_local – True if the file is on local.
- is_on_remote – True if the file is on remote.
- size – The size in bytes of the file.
- local_path – Path to the file on the local filesystem.
- remote_path – Path to the file on the remote filesystem.
- remote_gid – Unix group ID on remote.
- remote_perm – Unix file mode on remote.
- remote_uid – Unix user ID on remote.
 
 - checksum()
- Return the checksum of the file. - This can take a long time to compute for large files. - If the file exists on local, return the current checksum of the local file. Otherwise, return the stored checksum in the catalogue. 
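The following is a minimal sketch of the behaviour described above, not Scitacean's implementation. It assumes a hypothetical helper `current_checksum` that hashes a local file in chunks (so large files need not fit in memory) and otherwise falls back to a stored value. The function name, the `chunk_size` parameter, and the use of `sha256` are illustrative assumptions.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Optional


def current_checksum(
    local_path: Optional[Path],
    stored_checksum: Optional[str],
    algorithm: str = "sha256",
    chunk_size: int = 1 << 20,
) -> Optional[str]:
    """Hash the local file if there is one, else return the stored checksum.

    Hypothetical helper; it only mirrors the documented fallback rule.
    """
    if local_path is None:
        # Not on local: use the value recorded in the catalogue.
        return stored_checksum
    digest = hashlib.new(algorithm)
    with local_path.open("rb") as f:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Usage: a local file is hashed, a remote-only file uses the stored value.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "data.bin"
    path.write_bytes(b"scitacean")
    local_value = current_checksum(path, stored_checksum=None)

remote_value = current_checksum(None, stored_checksum="abc123")
```

Because the whole file has to be read, the cost grows linearly with file size, which is why the call can be slow for large files.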
 - property creation_time: datetime
- The logical creation time of the SciCat file. - If the file exists on local, return the time the local file was last modified. Otherwise, return the stored time in the catalogue. 
 - downloaded(*, local_path)
- Return new file metadata after a download. - Assumes that the input file exists on remote. The returned object is on both local and remote. 
 - classmethod from_download_model(model, *, checksum_algorithm=None, local_path=None)
- Construct a new file object from a SciCat download model. 
 - classmethod from_local(path, *, remote_path=None, remote_uid=None, remote_gid=None, remote_perm=None)
- Construct a File object for a file on the local filesystem. - The returned object references a file that exists locally but not on the remote. However, it does contain a remote_path, which is constructed from the provided local path or from the provided remote_path.
- Parameters:
- remote_path (Union[RemotePath, str, None], default: None) – Path on the remote, relative to the source_folder of a dataset. By default, it is constructed as path.name.
- remote_uid (Optional[str], default: None) – User ID on the remote. Will be determined automatically on upload.
- remote_gid (Optional[str], default: None) – Group ID on the remote. Will be determined automatically on upload.
- remote_perm (Optional[str], default: None) – File permissions on the remote. Will be determined automatically on upload.
 
- Returns:
- File – A new file object.
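
As an illustration of the default described for remote_path, the sketch below uses only the standard library. The helper `default_remote_path` is hypothetical and not part of Scitacean; it assumes that an explicit remote path takes precedence and that the fallback is the final component of the local path (`path.name`).

```python
from pathlib import PurePosixPath
from typing import Optional


def default_remote_path(local_path: str, remote_path: Optional[str] = None) -> str:
    """Pick the remote path for a local file (illustrative stand-in)."""
    if remote_path is not None:
        # An explicitly provided remote path is used unchanged.
        return remote_path
    # Otherwise keep only the file name, dropping all parent directories.
    return PurePosixPath(local_path).name


implicit = default_remote_path("/data/run_42/events.h5")
explicit = default_remote_path("/data/run_42/events.h5", "raw/events.h5")
```

With the default, two local files that share a name but live in different directories would map to the same remote path, so an explicit remote_path is the safer choice in that situation.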
 
 - classmethod from_remote(remote_path, size, creation_time, checksum=None, checksum_algorithm=None, remote_uid=None, remote_gid=None, remote_perm=None)
- Construct a new file object for a remote file. - The local path of the returned File is None.
- Parameters:
- remote_path (str | RemotePath) – Path to the remote file relative to the dataset’s source folder.
- size (int) – Size in bytes on the remote filesystem.
- creation_time (datetime | str) – Date and time the file was created on the remote filesystem. If a str, it is parsed using dateutil.parser.parse.
- checksum (Optional[str], default: None) – Checksum of the file.
- checksum_algorithm (Optional[str], default: None) – Algorithm used to compute the given checksum. Must be passed when checksum is not None.
- remote_uid (Optional[str], default: None) – User ID on the remote.
- remote_gid (Optional[str], default: None) – Group ID on the remote.
- remote_perm (Optional[str], default: None) – File permissions on the remote.
 
- Returns:
- File – A new file object.
 - Added in version 23.10.0. 
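
The two argument rules above (creation_time may be a datetime or a str, and checksum_algorithm must accompany checksum) can be sketched as follows. Both helpers are hypothetical. `datetime.fromisoformat` is used here only as a standard-library stand-in for dateutil.parser.parse, which accepts many more formats, and the `ValueError` is an assumption about how a missing algorithm might be reported.

```python
from datetime import datetime, timezone
from typing import Optional, Union


def normalize_creation_time(value: Union[datetime, str]) -> datetime:
    """Return a datetime, parsing ISO 8601 strings (stand-in for dateutil)."""
    if isinstance(value, str):
        return datetime.fromisoformat(value)
    return value


def check_checksum_args(checksum: Optional[str], checksum_algorithm: Optional[str]) -> None:
    """Reject a checksum that comes without the algorithm that produced it."""
    if checksum is not None and checksum_algorithm is None:
        # Assumed error type; the documented rule only says the algorithm is required.
        raise ValueError("checksum_algorithm must be given together with checksum")


parsed = normalize_creation_time("2023-10-01T12:00:00+00:00")
unchanged = normalize_creation_time(datetime(2023, 10, 1, 12, tzinfo=timezone.utc))
check_checksum_args("abc123", "sha256")  # passes silently
```

A checksum without its algorithm cannot be compared against a freshly computed one, which is presumably why the two arguments are tied together.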
 - local_is_up_to_date()
- Check if the file on local is up-to-date.
- Returns:
- bool – True if the file exists on local and its checksum matches the stored checksum for the remote file.
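
The return condition can be modelled with a short standard-library sketch. The function `is_up_to_date` below is a hypothetical stand-in, not Scitacean's code; in particular, treating a missing local file as "not up-to-date" and defaulting to `sha256` are assumptions made for the example.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Optional


def is_up_to_date(
    local_path: Optional[Path], stored_checksum: str, algorithm: str = "sha256"
) -> bool:
    """True only if the local file exists and its hash equals the stored one."""
    if local_path is None or not local_path.exists():
        return False
    # Small files are read in one go here; large files would be hashed in chunks.
    local_checksum = hashlib.new(algorithm, local_path.read_bytes()).hexdigest()
    return local_checksum == stored_checksum


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "data.txt"
    path.write_bytes(b"version 1")
    stored = hashlib.sha256(b"version 1").hexdigest()
    fresh = is_up_to_date(path, stored)  # contents match the stored checksum
    path.write_bytes(b"version 2")
    stale = is_up_to_date(path, stored)  # contents changed after the checksum was stored

missing = is_up_to_date(None, stored)  # no local file at all
```

A check of this kind lets a caller skip a download when an identical copy is already present.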
 
 - make_model(*, for_archive=False)
- Build a pydantic model for this file.
- Parameters:
- for_archive (bool, default: False) – Select whether the file is stored in an archive or on regular disk, that is, whether it belongs to a Datablock or an OrigDatablock.
- Returns:
- UploadDataFile – A new pydantic model.
 
 - remote_access_path(source_folder)
- Full path to the file on the remote if it exists.
 
 - remote_path: RemotePath
- Path to the file on the remote filesystem. 
 - property size: int
- The size in bytes of the file. - If the file exists on local, return the current size of the local file. Otherwise, return the stored size in the catalogue. 
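
The "local value first, stored value otherwise" rule used by size (and, analogously, by creation_time) can be sketched with the standard library. `effective_size` and `effective_creation_time` are hypothetical helpers, and converting the modification time to UTC is an assumption of this example.

```python
import tempfile
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def effective_size(local_path: Optional[Path], stored_size: Optional[int]) -> Optional[int]:
    """Current size of the local file, else the size stored in the catalogue."""
    if local_path is not None:
        return local_path.stat().st_size
    return stored_size


def effective_creation_time(
    local_path: Optional[Path], stored_time: Optional[datetime]
) -> Optional[datetime]:
    """Last-modified time of the local file, else the stored time."""
    if local_path is not None:
        return datetime.fromtimestamp(local_path.stat().st_mtime, tz=timezone.utc)
    return stored_time


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "data.txt"
    path.write_bytes(b"hello")
    local_size = effective_size(path, stored_size=1024)
    local_time = effective_creation_time(path, stored_time=None)

stored_time = datetime(2023, 10, 1, tzinfo=timezone.utc)
remote_size = effective_size(None, stored_size=1024)
remote_time = effective_creation_time(None, stored_time=stored_time)
```

Note that under this rule a locally modified file reports its new size even if the catalogue still holds the old one.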
 - uploaded(*, remote_path=None, remote_uid=None, remote_gid=None, remote_perm=None, remote_creation_time=None, remote_size=None)
- Return new file metadata after an upload. - Assumes that the input file exists on local. The returned object is on both local and remote.
- Parameters:
- remote_path (Union[RemotePath, str, None], default: None) – New remote path.
- remote_uid (Optional[str], default: None) – New user ID on remote, overwrites any current value.
- remote_gid (Optional[str], default: None) – New group ID on remote, overwrites any current value.
- remote_perm (Optional[str], default: None) – New unix permissions on remote, overwrites any current value.
- remote_creation_time (Optional[datetime], default: None) – Time the file became available on remote. Defaults to the current time in UTC.
- remote_size (Optional[int], default: None) – File size on remote.
 
- Returns:
- File – A new file object.
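
To make the three states (local, remote, local+remote) and the "return a new object" pattern concrete, here is a small model built on a frozen dataclass. `FileState` is a hypothetical stand-in and not the real File class. Deciding "on remote" from whether a remote size is stored is an assumption of this sketch.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class FileState:
    """Toy model of a file that may exist locally, remotely, or both."""

    remote_path: str
    local_path: Optional[str] = None
    remote_size: Optional[int] = None  # assumed marker for "exists on remote"

    @property
    def is_on_local(self) -> bool:
        return self.local_path is not None

    @property
    def is_on_remote(self) -> bool:
        return self.remote_size is not None

    def uploaded(self, *, remote_size: int) -> "FileState":
        # Return a copy that also carries remote metadata; self is unchanged.
        return replace(self, remote_size=remote_size)

    def downloaded(self, *, local_path: str) -> "FileState":
        # Return a copy that also has a local path; self is unchanged.
        return replace(self, local_path=local_path)


local_only = FileState(remote_path="events.h5", local_path="/data/events.h5")
after_upload = local_only.uploaded(remote_size=2048)

remote_only = FileState(remote_path="events.h5", remote_size=2048)
after_download = remote_only.downloaded(local_path="/tmp/events.h5")
```

Returning a new object instead of mutating in place keeps the original metadata available, for example to retry or roll back a failed transfer.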
 
 - validate_after_download()
- Check that the file on disk matches the metadata. - Compares file size and, if possible, its checksum. Raises on failure. If the function returns without exception, the file is valid.
- Raises:
- IntegrityError – If a check fails. 
- Return type: