uriutils-0.1 documentation
Welcome to the documentation for uriutils. This package aims to make the underlying storage system (e.g., S3, Google Cloud Storage, or a local filesystem) transparent to both users and developers by wrapping the different protocols in a common interface.
Currently, the following storage systems are supported:
- Local filesystem (i.e., empty or file scheme)
- Amazon Web Services Simple Storage Service (S3) using S3.Client (i.e., s3 scheme)
- Amazon Web Services Simple Notification Service (SNS) using SNS.Client (i.e., sns scheme)
- Google Cloud Storage using google.cloud.storage.client (i.e., gcs or gs scheme)
- HTTP using requests (i.e., http or https scheme)
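The scheme-to-storage mapping listed above can be sketched with urllib.parse.urlparse(). This is only an illustration of the idea and not how uriutils is implemented; the function storage_for() and the labels it returns are hypothetical.

```python
from urllib.parse import urlparse

# Scheme-to-storage table taken from the list of supported systems above.
# The label strings are illustrative only.
SCHEME_TO_STORAGE = {
    '': 'local filesystem',
    'file': 'local filesystem',
    's3': 'AWS S3',
    'sns': 'AWS SNS',
    'gcs': 'Google Cloud Storage',
    'gs': 'Google Cloud Storage',
    'http': 'HTTP',
    'https': 'HTTP',
}


def storage_for(uri):
    """Return the storage label for a URI (hypothetical helper)."""
    scheme = urlparse(uri).scheme
    try:
        return SCHEME_TO_STORAGE[scheme]
    except KeyError:
        raise ValueError('Unsupported scheme: {!r}'.format(scheme))


print(storage_for('/tmp/data.csv'))
print(storage_for('gs://my-bucket/data.csv'))
```

A plain path has an empty scheme, which is why the empty string maps to the local filesystem in this sketch.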
API Documentation
Read / Write functions
- uriutils.uriutils.uri_open(uri, mode='rb', auto_compress=True, in_memory=True, delete_tempfile=True, textio_args={}, storage_args={})
  Opens a URI for reading / writing. Analogous to the open() function. This method supports with context handling:

      with uri_open('http://www.example.com', mode='r') as f:
          print(f.read())

  Parameters:
  - uri (str) – URI of file to open
  - mode (str) – One of rb, r, w, or wb, for read/write modes in binary/text respectively
  - auto_compress (bool) – Whether to automatically use the gzip module with .gz URIs
  - in_memory (bool) – Whether to store the entire file in memory or in a local temporary file
  - delete_tempfile (bool) – When in_memory is False, whether to delete the temporary file on close
  - textio_args (dict) – Keyword arguments to pass to io.TextIOWrapper for text read/write mode
  - storage_args (dict) – Keyword arguments to pass to the underlying storage object
  Returns: file-like object to URI
- uriutils.uriutils.uri_read(*args, **kwargs)
  Reads the contents of a URI into a string or bytestring. See uri_open() for complete description of keyword parameters.
  Returns: Contents of URI
  Return type: str, bytes
- uriutils.uriutils.uri_dump(uri, content, mode='wb', **kwargs)
  Dumps the contents of a string/bytestring into a URI. See uri_open() for complete description of keyword parameters.
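The auto_compress and textio_args behaviour described for uri_open() can be illustrated for the local-file case using only the standard library. This is a minimal sketch under stated assumptions, not the package's implementation: open_local() is a hypothetical helper, and it assumes that compression is chosen by the .gz suffix and that text mode is a TextIOWrapper layered over a binary stream.

```python
import gzip
import io
import os
import tempfile


def open_local(path, mode='rb', auto_compress=True, textio_args=None):
    """Hypothetical stand-in for the local-file case of uri_open()."""
    # Always open the underlying stream in binary mode.
    binary_mode = mode if 'b' in mode else mode + 'b'
    if auto_compress and path.endswith('.gz'):
        raw = gzip.open(path, binary_mode)
    else:
        raw = open(path, binary_mode)
    if 'b' in mode:
        return raw
    # Text mode: wrap the binary stream, forwarding textio_args.
    return io.TextIOWrapper(raw, **(textio_args or {}))


with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'example.txt.gz')
    with open_local(path, 'w', textio_args={'encoding': 'utf-8'}) as f:
        f.write('hello')
    with open_local(path, 'r', textio_args={'encoding': 'utf-8'}) as f:
        text = f.read()
    # Read the raw bytes to confirm the file really is gzip-compressed.
    with open(path, 'rb') as f:
        magic = f.read(2)

print(text)
```

In this sketch, read and dump helpers in the style of uri_read() and uri_dump() would be thin wrappers that open, then read or write, then close.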
URI information
- uriutils.uriutils.uri_exists(uri, storage_args={})
  Check if URI exists.
  Returns: True if URI exists
- uriutils.uriutils.uri_exists_wait(uri, timeout=300, interval=5, storage_args={})
  Blocks / waits until the URI exists.
  Parameters:
  - uri (str) – URI to check existence
  - timeout (float) – Number of seconds before timing out
  - interval (float) – Calls uri_exists() every interval seconds
  - storage_args (dict) – Keyword arguments to pass to the underlying storage object
  Returns: True if URI exists
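The timeout / interval polling described for uri_exists_wait() can be sketched as a generic loop. The helper wait_until() is hypothetical, and returning False on timeout is an assumption made for this sketch; the documentation above does not say what the real function does when the timeout expires.

```python
import time


def wait_until(predicate, timeout=300, interval=5):
    """Poll predicate() every `interval` seconds for up to `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        # Assumption: give up (return False) if the next check would
        # fall after the deadline.
        if time.monotonic() + interval > deadline:
            return False
        time.sleep(interval)


calls = []


def appears_on_third_check():
    """Simulates a resource that only exists from the third check onwards."""
    calls.append(1)
    return len(calls) >= 3


found = wait_until(appears_on_third_check, timeout=1, interval=0.01)
timed_out = wait_until(lambda: False, timeout=0.05, interval=0.01)
print(found, len(calls), timed_out)
```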
Argument Parser types
- class uriutils.uriutils.URIType
  A convenience class that can be used as the type argument to argparse.ArgumentParser.add_argument(). It will return the result of urllib.parse.urlparse().
- class uriutils.uriutils.URIFileType(mode='rb', **kwargs)
  A convenience class that can be used as the type argument to argparse.ArgumentParser.add_argument(). It will return a file-like object using uri_open(). See uri_open() for complete description of keyword parameters.
- class uriutils.uriutils.URIDirType(create=False, storage_args={})
  A convenience class that can be used as the type argument to argparse.ArgumentParser.add_argument(). It will return the result of urllib.parse.urlparse().
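The way a callable object plugs into the type argument of argparse.ArgumentParser.add_argument() can be sketched with the standard library alone. ParsedURI below is a hypothetical stand-in that mimics what URIType is described as returning; it is not the class shipped by uriutils, and the bucket and key are made up.

```python
import argparse
from urllib.parse import urlparse


class ParsedURI:
    """Hypothetical stand-in: converts a command-line string with urlparse()."""

    def __call__(self, value):
        return urlparse(value)


parser = argparse.ArgumentParser()
# argparse calls the `type` object with the raw string from the command line.
parser.add_argument('source', type=ParsedURI())

args = parser.parse_args(['s3://my-bucket/data/file.csv'])
print(args.source.scheme, args.source.netloc, args.source.path)
```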
Storages Documentation
This module defines all the storage systems supported by uriutils.
- class uriutils.storages.URIBytesOutput(uri_obj)
  A BytesIO object for output that flushes content to the remote URI on close.
  - name
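The flush-on-close idea behind URIBytesOutput can be sketched by subclassing io.BytesIO. This is an illustration only: FlushOnCloseBytes and its upload callback are hypothetical, and the real class takes a storage object (uri_obj) rather than a callable.

```python
import io


class FlushOnCloseBytes(io.BytesIO):
    """In-memory buffer that hands its content to `upload` when closed."""

    def __init__(self, upload):
        super().__init__()
        self._upload = upload

    def close(self):
        # Guard against double uploads: close() may be called more than once.
        if not self.closed:
            self._upload(self.getvalue())
        super().close()


uploaded = []
with FlushOnCloseBytes(uploaded.append) as out:
    out.write(b'hello ')
    out.write(b'world')

print(uploaded)
```

Nothing reaches the callback until the with block exits, which mirrors the "flushes content to the remote URI on close" description.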
- class uriutils.storages.BaseURI(storage_args={})
  This is the base URI storage object that is inherited by the different storage systems. It defines the methods and operations that can be “conducted” on a URI. Almost all of these methods have to be implemented by a storage class.
  - SUPPORTED_SCHEMES = []
    Defines the schemes supported by this storage system.
  - VALID_STORAGE_ARGS = []
    The set of storage_args keyword arguments that is handled by this storage system.
  - __init__(storage_args={})
    Parameters: storage_args (dict) – Arguments that will be applied to the storage system for read/write operations
  - dir_exists()
    Check if the URI exists as a directory.
    Returns: True if URI exists as a directory
    Return type: bool
  - download_file(filename)
    Download the binary content stored in the URI for this object directly to local file.
    Parameters: filename (str) – Filename on local filesystem
  - join(path)
    Similar to os.path.join() but returns a storage object instead.
    Parameters: path (str) – path to join on to this object’s URI
    Returns: a storage object
    Return type: BaseURI
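How a storage class might build on these attributes can be sketched as follows. Everything here is hypothetical (SketchURI, SketchBucketURI, the demo scheme), and dropping storage_args keys that are not in VALID_STORAGE_ARGS is an assumption; the documentation above does not say whether unknown keys are ignored or rejected.

```python
import posixpath


class SketchURI:
    """Hypothetical base class modelled on the BaseURI description."""

    SUPPORTED_SCHEMES = []
    VALID_STORAGE_ARGS = []

    def __init__(self, storage_args=None):
        storage_args = storage_args or {}
        # Assumption: keys not listed in VALID_STORAGE_ARGS are dropped.
        self.storage_args = {
            k: v for k, v in storage_args.items()
            if k in self.VALID_STORAGE_ARGS
        }


class SketchBucketURI(SketchURI):
    """Hypothetical bucket/key storage using a made-up 'demo' scheme."""

    SUPPORTED_SCHEMES = ['demo']
    VALID_STORAGE_ARGS = ['ContentType']

    def __init__(self, bucket, key, storage_args=None):
        super().__init__(storage_args)
        self.bucket = bucket
        self.key = key

    def join(self, path):
        # Like os.path.join(), but returns a new storage object.
        return type(self)(self.bucket, posixpath.join(self.key, path),
                          self.storage_args)

    def __str__(self):
        return 'demo://{}/{}'.format(self.bucket, self.key)


base = SketchBucketURI('my-bucket', 'reports',
                       {'ContentType': 'text/csv', 'unknown': 1})
child = base.join('2020/summary.csv')
print(child, child.storage_args)
```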
Local filesystem
AWS Simple Storage Service
- class uriutils.storages.S3URI(bucket, key, storage_args={})
  Storage system for AWS S3.
  - VALID_STORAGE_ARGS = ['CacheControl', 'ContentDisposition', 'ContentEncoding', 'ContentLanguage', 'ContentLength', 'ContentMD5', 'ContentType', 'Expires', 'GrantFullControl', 'GrantRead', 'GrantReadACP', 'GrantWriteACP', 'Metadata', 'ServerSideEncryption', 'StorageClass', 'WebsiteRedirectLocation', 'SSECustomerAlgorithm', 'SSECustomerKey', 'SSEKMSKeyId', 'RequestPayer', 'Tagging']
    Storage arguments allowed to pass to S3.Client methods.
Google Cloud Storage
- class uriutils.storages.GoogleCloudStorageURI(bucket, key, storage_args={})
  Storage system for Google Cloud Storage.
  - SUPPORTED_SCHEMES = set(['gcs', 'gs'])
    Supported schemes for GoogleCloudStorageURI.
  - VALID_STORAGE_ARGS = ['chunk_size', 'encryption_key']
    Storage arguments allowed to pass to google.cloud.storage.client methods.
  - __init__(bucket, key, storage_args={})
    Parameters:
    - bucket (str) – Bucket name
    - key (str) – Key to file
    - storage_args (dict) – Keyword arguments that are passed to google.cloud.storage.client
HTTP
- class uriutils.storages.HTTPURI(url, raise_for_status=True, method=None, storage_args={})
  Storage system for HTTP/HTTPS.
  - VALID_STORAGE_ARGS = ['params', 'headers', 'cookies', 'auth', 'timeout', 'allow_redirects', 'proxies', 'verify', 'stream', 'cert', 'method']
    Keyword arguments passed to requests.request().
  - __init__(url, raise_for_status=True, method=None, storage_args={})
    Parameters:
    - url (str) – HTTP URI.
    - raise_for_status (bool) – Whether to raise a requests.RequestException when the response status code is not 2xx (i.e., calls requests.Response.raise_for_status())
    - method (str) – Overrides the default method for all HTTP operations.
    - storage_args (dict) – Keyword arguments that are passed to requests.request()
  - put_content(content)
    Makes a PUT request with the content in the body.
    Raises: a requests.RequestException if the response status is not 2xx.
AWS Simple Notification Service
- class uriutils.storages.SNSURI(topic_name, region, storage_args={})
  Storage system for AWS Simple Notification Service.
  - VALID_STORAGE_ARGS = ['Subject', 'MessageAttributes', 'MessageStructure']
    Keyword arguments passed to SNS.Client.publish().
  - __init__(topic_name, region, storage_args={})
    Parameters:
    - topic_name (str) – Name of SNS topic for publishing; it can be either an ARN or just the topic name (thus defaulting to the current role’s account)
    - region (str) – AWS region of SNS topic (defaults to current role’s region)
    - storage_args (dict) – Keyword arguments that are passed to SNS.Client.publish()
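The "ARN or just the topic name" rule for topic_name can be sketched as a small string helper. resolve_topic_arn() is hypothetical, the account ID is passed in explicitly because the documentation above does not say how the current role's account is looked up, and the sketch assumes the standard aws partition.

```python
def resolve_topic_arn(topic_name, region, account_id):
    """Return a full SNS topic ARN for either an ARN or a bare topic name."""
    if topic_name.startswith('arn:'):
        # Already a full ARN: use it unchanged.
        return topic_name
    # Assumption: standard 'aws' partition; account_id supplied by the caller.
    return 'arn:aws:sns:{}:{}:{}'.format(region, account_id, topic_name)


bare = resolve_topic_arn('alerts', 'us-east-1', '123456789012')
full = resolve_topic_arn('arn:aws:sns:eu-west-1:123456789012:alerts',
                         'us-east-1', '999999999999')
print(bare)
print(full)
```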