lor.util package

Submodules

lor.util.cli module

Utilities for command-line interfaces

class lor.util.cli.CliCommand

Bases: object

Abstract class that represents a CLI command (e.g. run, import)

description()

Override to provide a method that returns a human-readable description of the command.

name()

Override to provide a method that returns a unique name for the command.

run(argv)

Override to provide a method that is called with the command’s arguments (argv).
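
A minimal sketch of a concrete command, assuming the three overrides are plain instance methods. The CliCommand class below is a stand-in so the example runs on its own; in a real project it would be imported from lor.util.cli. HelloCommand, its --name flag, and the choice to return the greeting from run are all hypothetical.

```python
import argparse


class CliCommand:
    """Stand-in for lor.util.cli.CliCommand (assumed shape, not the real class)."""

    def name(self):
        raise NotImplementedError()

    def description(self):
        raise NotImplementedError()

    def run(self, argv):
        raise NotImplementedError()


class HelloCommand(CliCommand):
    """Hypothetical command that builds a greeting."""

    def name(self):
        return "hello"

    def description(self):
        return "Print a greeting"

    def run(self, argv):
        # Parse only this command's own arguments.
        parser = argparse.ArgumentParser(prog=self.name(),
                                         description=self.description())
        parser.add_argument("--name", default="world")
        args = parser.parse_args(argv)
        # Returning the message is for illustration only.
        return "Hello, {}!".format(args.name)


greeting = HelloCommand().run(["--name", "LoR"])
```
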

lor.util.cli.add_command_to_parser(arg_parser, command)
lor.util.cli.add_commands_as_subcommands(arg_parser, subcommands)
lor.util.cli.add_properties_override_arg(subparser)
lor.util.cli.extract_property_overrides(namespace)

lor.util.reflection module

Utilities for reflecting over Python modules, packages, and classes

lor.util.reflection.classes_in_module(module)

Returns an iterable of python classes found in module

Parameters:module – The module to search in
Returns:An iterable of classes found in module
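
One way such a lookup could work, sketched with the standard library's inspect module. The name classes_in_module_sketch is hypothetical, and this version also yields classes that the module merely imported; whether LoR's implementation does the same is not stated here.

```python
import collections
import inspect


def classes_in_module_sketch(module):
    """Yield every class object that is an attribute of module."""
    for _, obj in inspect.getmembers(module, inspect.isclass):
        yield obj


# Usage: the standard library's collections module exposes OrderedDict.
found = set(classes_in_module_sketch(collections))
```
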

lor.util.reflection.classes_in_pkg(package)

Returns an iterable of python classes found in package

Parameters:package – The package to search in
Returns:An iterable of classes found in package
lor.util.reflection.filter_subclasses(superclass, iter)

Returns an iterable of the class objects in the source iterable iter that are subclasses of superclass.

Parameters:superclass – The superclass to filter against
Returns:An iterable of classes which are subclasses of superclass
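
The filtering described above can be sketched as a generator expression. filter_subclasses_sketch is a hypothetical name, and this version has two assumed behaviours: non-class items are skipped, and superclass itself is kept if it appears in the source (because issubclass(X, X) is True).

```python
import inspect


def filter_subclasses_sketch(superclass, candidates):
    """Lazily keep the items of candidates that are subclasses of superclass."""
    return (c for c in candidates
            if inspect.isclass(c) and issubclass(c, superclass))


class Base:
    pass


class Child(Base):
    pass


class Other:
    pass


matches = list(filter_subclasses_sketch(Base, [Child, Other, 42]))
```
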
lor.util.reflection.is_pkg(mod_or_pkg)

Returns True if mod_or_pkg is a package

lor.util.reflection.load_class_by_name(full_class_name)

Load a class by its fully-qualified name.

“Fully qualified name” is a LoR-specific term meaning PACKAGE.MODULE.CLASS_NAME. Example:

  • lor.tasks.fs.EnsureExistsOnLocalFilesystemTask

Raises an exception if full_class_name cannot be found.

Parameters:full_class_name – Fully-qualified name of a class as a string
Returns:A python class
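
A sketch of how a PACKAGE.MODULE.CLASS_NAME string can be resolved with importlib. load_class_by_name_sketch is a hypothetical name, and the exception types it raises are those of the standard library calls it uses, not necessarily the ones LoR raises.

```python
import importlib


def load_class_by_name_sketch(full_class_name):
    """Import the module part of the name, then look up the class on it."""
    module_name, _, class_name = full_class_name.rpartition(".")
    if not module_name:
        raise ValueError(
            "Expected PACKAGE.MODULE.CLASS_NAME, got: " + full_class_name)
    module = importlib.import_module(module_name)  # ImportError if missing
    return getattr(module, class_name)  # AttributeError if missing


# Usage with a standard library class, so the example runs without LoR.
ordered_dict_cls = load_class_by_name_sketch("collections.OrderedDict")
```
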
lor.util.reflection.subclasses_in_pkg(package, superclass)

Returns an iterable of classes in a package which are subclasses of a superclass.

Parameters:
  • package – The package to search in
  • superclass – The superclass to filter against
Returns:

An iterable of classes in package which are subclasses of superclass

lor.util.subprocess module

Utilities for running separate subprocesses

Many of these functions wrap the Python standard library’s subprocess functions. Prefer them over the standard library’s functions because, in many cases:

  • They handle signal capture (e.g. SIGINT), which matters when tasks are long-running and likely to be cancelled midway.
  • They stream the subprocess’s output directly to the calling process’s stdio rather than collecting it in memory and dumping it after execution finishes. This matters for long-running tasks that produce a lot of log output.
  • Output streaming is handled functionally: downstream users don’t need to spawn threads; they write lambdas or functions that filter and reduce the process output. This matters when a subprocess produces a lot of log output (e.g. long-running Hadoop MR jobs) and buffering all of it risks exhausting memory.
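
The reducer idea in the last bullet can be illustrated with a small sketch built on subprocess.Popen. This is not LoR's implementation: call_with_stdout_reducer_sketch is a hypothetical name, the (exit_code, state) return shape is an assumption, and signal handling is reduced to terminating the child on KeyboardInterrupt.

```python
import subprocess
import sys


def call_with_stdout_reducer_sketch(args, initial_state, reducer):
    """Run args, folding each stdout line into a state as it arrives.

    Only the current state is kept in memory, never the full output.
    Returns (exit_code, final_state); this shape is an assumption.
    """
    state = initial_state
    proc = subprocess.Popen(args, stdout=subprocess.PIPE,
                            universal_newlines=True)
    try:
        for line in proc.stdout:
            state = reducer(state, line.rstrip("\n"))
    except KeyboardInterrupt:
        proc.terminate()  # stop the child if the user cancels midway
        raise
    finally:
        proc.stdout.close()
        exit_code = proc.wait()
    return exit_code, state


# Usage: count output lines of a child Python process.
child = [sys.executable, "-c", "print('a'); print('b'); print('c')"]
exit_code, line_count = call_with_stdout_reducer_sketch(
    child, 0, lambda count, line: count + 1)
```
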
lor.util.subprocess.call(args)

Synchronously run the command described by args, returning an exit code integer once the subprocess exits.

Parameters:args – List of args, where the first arg is the application’s name
Returns:Exit code of the subprocess

lor.util.subprocess.call_and_write_stdout_to_file(args, output_path)
lor.util.subprocess.call_with_output_reducers(args, stdout_initial_state=None, stdout_reducer=<function <lambda>>, stderr_initial_state=None, stderr_reducer=<function <lambda>>)
lor.util.subprocess.call_with_stdout_reducer(args, initial_reducer_state, output_reducer)
lor.util.subprocess.run_luigi_task(task_class, task_args)

Module contents

A module containing general helpers

lor.util.base36_str(desired_length=5)

Returns a randomly-generated base36 string with length desired_length.

Parameters:desired_length – Length of the string to generate
Returns:A randomly-generated base36 string
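
A sketch of one way to generate such a string. base36_str_sketch and BASE36_ALPHABET are hypothetical names, and the use of lowercase letters is an assumption; base36 could equally be written with uppercase letters.

```python
import random
import string

# Assumed alphabet: digits 0-9 followed by lowercase a-z (36 symbols).
BASE36_ALPHABET = string.digits + string.ascii_lowercase


def base36_str_sketch(desired_length=5):
    """Return desired_length characters drawn at random from the alphabet."""
    return "".join(random.choice(BASE36_ALPHABET)
                   for _ in range(desired_length))


sample = base36_str_sketch(8)
```
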
lor.util.bullet_point_list(str_list)
lor.util.file_uri(path)

Returns path as a file URI

Relative paths are resolved relative to the current working directory.

Parameters:path – A path string to convert
Returns:A file URI string (e.g. file:///usr)
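
The behaviour described above can be approximated with pathlib. file_uri_sketch is a hypothetical name; the sketch resolves relative paths against the current working directory before converting, since Path.as_uri() requires an absolute path.

```python
import os
from pathlib import Path


def file_uri_sketch(path):
    """Return path as a file URI, resolving relative paths against the cwd."""
    return Path(os.path.abspath(path)).as_uri()


relative_uri = file_uri_sketch("some/relative/dir")
expected_uri = Path(os.getcwd(), "some", "relative", "dir").as_uri()
```
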
lor.util.merge(h1, h2)

Returns a new dictionary containing the key-value pairs of h1 combined with the key-value pairs of h2.

Duplicate keys from h2 overwrite keys from h1.

Parameters:
  • h1 – A python dict
  • h2 – A python dict
Returns:

A new python dict containing the merged key-value pairs
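
The merge semantics above (h2 wins on duplicate keys, inputs left untouched) can be sketched as follows. merge_sketch is a hypothetical name used so the example runs without LoR installed.

```python
def merge_sketch(h1, h2):
    """Return a new dict with h1's pairs, overwritten by h2's pairs."""
    merged = dict(h1)   # shallow copy so h1 is not modified
    merged.update(h2)   # duplicate keys take h2's value
    return merged


defaults = {"a": 1, "b": 2}
overrides = {"b": 3, "c": 4}
combined = merge_sketch(defaults, overrides)
```
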

lor.util.or_join(strs)
lor.util.read_file_to_string(path)

Returns the contents of a file at path as a string.

Parameters:

path – A path string

Returns:

Contents of the file as a string

Raises:
  • FileNotFoundError – If path does not exist
  • UnicodeDecodeError – If file does not contain UTF-8/ASCII text
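
A sketch that matches the documented return value and exceptions, assuming the file is decoded as UTF-8. read_file_to_string_sketch is a hypothetical name, and the temporary file exists only to make the example runnable.

```python
import os
import tempfile


def read_file_to_string_sketch(path):
    """Return the whole file as a string, decoding it as UTF-8."""
    with open(path, "r", encoding="utf-8") as f:
        return f.read()


# Usage: round-trip through a temporary file.
tmp_dir = tempfile.mkdtemp()
sample_path = os.path.join(tmp_dir, "sample.txt")
with open(sample_path, "w", encoding="utf-8") as f:
    f.write("hello\n")
contents = read_file_to_string_sketch(sample_path)

# A missing path raises FileNotFoundError, as documented above.
try:
    read_file_to_string_sketch(os.path.join(tmp_dir, "missing.txt"))
    missing_raised = False
except FileNotFoundError:
    missing_raised = True
```
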
lor.util.to_camel_case(snake_case_str)

Returns snake_case_str (e.g. some_str) in CamelCase (e.g. SomeStr).

Parameters:snake_case_str – A string in snake_case
Returns:The string in CamelCase
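
A sketch of the conversion, assuming words are separated by single underscores. to_camel_case_sketch is a hypothetical name; how LoR treats leading, trailing, or doubled underscores is not documented here.

```python
def to_camel_case_sketch(snake_case_str):
    """Uppercase the first letter of each underscore-separated word and join."""
    return "".join(part[:1].upper() + part[1:]
                   for part in snake_case_str.split("_"))


camel = to_camel_case_sketch("some_str")
```
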
lor.util.to_snake_case(camel_case_str)

Returns camel_case_str (e.g. SomeStr) as snake_case (e.g. some_str)

Parameters:camel_case_str – A string in CamelCase
Returns:The string in snake_case
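
A sketch of the reverse conversion using a regular expression that inserts an underscore before every uppercase letter except the first character. to_snake_case_sketch is a hypothetical name, and this simple rule splits acronyms letter by letter; LoR's handling of acronyms is not documented here.

```python
import re


def to_snake_case_sketch(camel_case_str):
    """Insert '_' before each non-initial uppercase letter, then lowercase."""
    return re.sub(r"(?<!^)(?=[A-Z])", "_", camel_case_str).lower()


snake = to_snake_case_sketch("SomeStr")
```
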
lor.util.try_construct_suggestions_msg(dic, k)
lor.util.try_get_val_or_key_error(dic, k)
lor.util.uri_subfolder(base, subfolder)

Returns a URI created by appending subfolder (a path) to the end of base (a URI).

Assumes the protocol supports relative paths. This function exists because Python’s urljoin only resolves relative paths for a whitelist of known schemes, which excludes custom protocols (e.g. hdfs). See: https://bugs.python.org/issue18828

Parameters:
  • base – A URI
  • subfolder – A path

Returns:

A URI
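
A scheme-agnostic join can be sketched with plain string handling, which sidesteps urljoin entirely. uri_subfolder_sketch is a hypothetical name, the host "namenode" is made up, and the sketch ignores query strings and fragments in base.

```python
def uri_subfolder_sketch(base, subfolder):
    """Join base and subfolder with exactly one '/' between them."""
    return base.rstrip("/") + "/" + subfolder.lstrip("/")


joined = uri_subfolder_sketch("hdfs://namenode/data", "2017/logs")
joined_with_slashes = uri_subfolder_sketch("hdfs://namenode/data/", "/2017/logs")
```
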

lor.util.write_str_to_file(path, s)

Write s to a file located at path.

Parameters:
  • path – Destination file path
  • s – The string to write

Raises:FileExistsError – If a file already exists at path
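
The refuse-to-overwrite behaviour can be sketched with open()'s exclusive-creation mode "x", which raises FileExistsError when the target already exists. write_str_to_file_sketch is a hypothetical name and the UTF-8 encoding is an assumption.

```python
import os
import tempfile


def write_str_to_file_sketch(path, s):
    """Create path and write s to it; fail if path already exists."""
    with open(path, "x", encoding="utf-8") as f:
        f.write(s)


# Usage: the second write to the same path is rejected.
target = os.path.join(tempfile.mkdtemp(), "out.txt")
write_str_to_file_sketch(target, "first")
try:
    write_str_to_file_sketch(target, "second")
    second_write_raised = False
except FileExistsError:
    second_write_raised = True

with open(target, encoding="utf-8") as f:
    final_contents = f.read()
```
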