Primary Modules

ets_generate_data.py

This module is the main entry point for generating a set of simulated tilt stacks.

The tilt stacks are generated using the TEM-Simulator software package, which takes a PDB or MRC map file of a particle of interest as input and generates a cryo-EM image stack containing that particle. More complicated particle sources (e.g. particles with randomized orientations and membrane segments added around them) can be used by providing a custom Assembler class, which interfaces with a Chimera REST Server to assemble these particle sources and feed them into each run of the TEM-Simulator.

Kyung Min Shin, 2020

ets_generate_data.configure_root_logger(queue)

Helper function to initialize and configure the main logger instance to handle log messages.

Parameters

queue – An instance of the multiprocessing.Queue class, which provides thread- and process-safe handling of log messages coming from many child processes.

Returns: None
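
A minimal sketch of what such a configuration might look like, assuming the standard library's logging.handlers.QueueHandler is used to forward records onto the queue. The helper name configure_queue_logger and the default log level are illustrative assumptions, not taken from the module's source.

```python
import logging
import logging.handlers
import queue


def configure_queue_logger(log_queue, level=logging.DEBUG):
    """Forward every record seen by the root logger onto log_queue.

    Hypothetical helper: the real configure_root_logger may differ.
    """
    root = logging.getLogger()
    root.addHandler(logging.handlers.QueueHandler(log_queue))
    root.setLevel(level)


# Usage: queue.Queue keeps the demo in one process; multiprocessing.Queue
# is the cross-process equivalent.
demo_queue = queue.Queue()
configure_queue_logger(demo_queue)
logging.getLogger("child-0").info("particle assembled")
record = demo_queue.get_nowait()
```

Because child processes only enqueue records, a single listener can own the log file and no two processes write to it at once.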

ets_generate_data.main(configs)

The main driver process, which sets up top-level run directories and spawns necessary child processes.

Returns: None

ets_generate_data.parse_inputs()

Instantiate and set up the command line arguments parser for the ets_generate_data module.

Returns: None

ets_generate_data.run_chimera_server(chimera_path, commands_queue, process_events)

Run the Chimera REST Server in a child process.

ETSimulations runs a single Chimera REST Server instance, shared by all multiprocessing child processes, which Assembler modules use to build up particle models. Each child process whose Assembler uses the Chimera server sends the entire set of commands needed to generate a model at once, so that Chimera sessions remain separate.

Parameters
  • chimera_path

  • commands_queue – The multiprocessing queue, filled by particle Assemblers in other processes, that provides thread-safe piping of the Chimera commands to be sent as HTTP GET requests

  • process_events – A dictionary mapping each child process ID to its process-specific multiprocessing acknowledgement event, which signals to that process's Assembler when the commands it sent have been completed

Returns: None
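
The queue-and-acknowledgement handshake described above can be sketched as follows. This is a hedged illustration, not the module's code: the (pid, commands) message shape, the None shutdown sentinel, and the send callable (standing in for the HTTP GET request to the Chimera REST Server) are all assumptions. It uses queue.Queue and threading.Event so that it runs in a single process; the multiprocessing equivalents expose the same get/put and set/wait methods.

```python
import queue
import threading


def serve_commands(commands_queue, process_events, send):
    """Consume command batches until a None sentinel arrives.

    Each queue item is assumed to be (pid, [command, ...]). After a whole
    batch has been sent, the event registered for that pid is set so the
    waiting Assembler knows its model is ready.
    """
    while True:
        item = commands_queue.get()
        if item is None:  # assumed shutdown signal
            break
        pid, commands = item
        for command in commands:
            send(command)  # hypothetical: e.g. one HTTP GET per command
        process_events[pid].set()


# Usage: record the commands instead of contacting a real server.
sent = []
events = {0: threading.Event(), 1: threading.Event()}
commands = queue.Queue()
commands.put((1, ["open particle.pdb", "close session"]))
commands.put(None)
serve_commands(commands, events, sent.append)
```

Sending a full batch per queue item means commands from different child processes are never interleaved, which matches the stated goal of keeping Chimera sessions separate.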

ets_generate_data.run_process(configs, pid, metadata_queue, chimera_commands_queue, ack_event, complete_event)

Drives a single child process of the simulation pipeline.

A temporary data directory is first created for use only by the child process. An Assembler instance is then created and, for each tiltseries simulation assigned to the child process, the appropriate number of particles is assembled and passed to the TEM-Simulator, which simulates the tilt stack.

Parameters
  • configs – The command line arguments passed to the main ets_generate_data process

  • pid – The process ID of this child process

  • metadata_queue – The multiprocessing queue used for sending metadata log messages to the central log listener

  • chimera_commands_queue – The multiprocessing queue where commands for the Chimera REST Server can be sent by the particle Assembler

  • ack_event – A child process-specific multiprocessing Event to subscribe to in order to know when the Chimera commands we send off to the server have been completed

  • logger – Python logging module logger to send logs to

  • complete_event – A child process-specific multiprocessing Event used to indicate to the main process that this child has finished processing its jobs

Returns: None

ets_generate_data.scale_mrc(filename, apix=1.0)

Given a raw tilt stack output by the TEM-Simulator, add voxel size information to the header.

Parameters
  • filename – The path to the raw tiltseries MRC that should be processed

  • apix – The voxel size, in angstroms per pixel

Returns: None
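
As an illustration of what adding voxel size information to an MRC header involves, the sketch below rewrites the cell dimensions directly, using only the standard library. It assumes a little-endian MRC2014 file, in which the voxel size along each axis is the cell dimension (CELLA) divided by the grid sampling (MX, MY, MZ). The function name set_voxel_size is hypothetical; the actual scale_mrc implementation may well use a dedicated MRC library instead.

```python
import os
import struct
import tempfile

HEADER_SIZE = 1024   # fixed-size MRC2014 main header, in bytes
GRID_OFFSET = 28     # MX, MY, MZ: three int32 values (header words 8-10)
CELLA_OFFSET = 40    # CELLA x, y, z: three float32 values (header words 11-13)


def set_voxel_size(filename, apix=1.0):
    """Set the cell dimensions so that CELLA / M equals apix on every axis."""
    with open(filename, "r+b") as mrc:
        header = mrc.read(HEADER_SIZE)
        if len(header) < HEADER_SIZE:
            raise ValueError("file is too short to contain an MRC header")
        mx, my, mz = struct.unpack_from("<3i", header, GRID_OFFSET)
        mrc.seek(CELLA_OFFSET)
        mrc.write(struct.pack("<3f", mx * apix, my * apix, mz * apix))


# Usage: build a header-only dummy file with a 4 x 4 x 2 grid and scale it.
dummy = bytearray(HEADER_SIZE)
struct.pack_into("<3i", dummy, GRID_OFFSET, 4, 4, 2)
with tempfile.NamedTemporaryFile(suffix=".mrc", delete=False) as tmp:
    tmp.write(dummy)
set_voxel_size(tmp.name, apix=2.5)
with open(tmp.name, "rb") as mrc:
    cell = struct.unpack_from("<3f", mrc.read(HEADER_SIZE), CELLA_OFFSET)
os.remove(tmp.name)
```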

ets_generate_data.start_logger(logs_queue, logfile)

Start the multiprocessing logging process.

Parameters
  • logs_queue – A multiprocessing queue to take in and digest log messages

  • logfile – The output text file to log to

Returns: The child process of the log listener
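
A log listener of this kind might look like the sketch below. The function name listen_for_logs, the None shutdown sentinel, and the log line format are assumptions made for illustration. The loop is run directly here so that the example stays in one process.

```python
import logging
import os
import queue
import tempfile


def listen_for_logs(logs_queue, logfile):
    """Write queued LogRecords to logfile until a None sentinel arrives."""
    handler = logging.FileHandler(logfile)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
    )
    try:
        while True:
            record = logs_queue.get()
            if record is None:  # assumed shutdown signal
                break
            handler.handle(record)
    finally:
        handler.close()


# Usage: enqueue one record and the sentinel, then drain the queue.
demo_queue = queue.Queue()
demo_queue.put(
    logging.makeLogRecord(
        {"name": "child-1", "levelname": "INFO", "msg": "tilt stack written"}
    )
)
demo_queue.put(None)
fd, demo_logfile = tempfile.mkstemp(suffix=".log")
os.close(fd)
listen_for_logs(demo_queue, demo_logfile)
with open(demo_logfile) as f:
    logged = f.read()
os.remove(demo_logfile)
```

To run the listener as a child process, the same function could be passed as the target of a multiprocessing.Process together with a multiprocessing.Queue, and the resulting process object returned to the caller.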

ets_process_data.py

This module is the entry point for processing simulated raw tilt stacks generated by ets_generate_data.py.

Specifically, given a list of parameters in a YAML configuration file, project directories are set up for the specified cryo-ET processing software packages (e.g. EMAN2), and scripts are generated to run all specified steps of the processing pipeline.

ets_process_data.main(args)

Create the processed_data directory if needed and call the proper handlers for all specified processors parsed from the configurations.

Parameters

args – The parsed dictionary of input parameters from the configuration file.

Returns: None

ets_process_data.parse_inputs()

Instantiate and set up the command line arguments parser for the ets_process_data module.

Returns: None

ets_process_data.save_processor_info(processed_data_dir, processor)

Save the processor arguments and the time the processor was run, for future reference.

Parameters
  • processed_data_dir – The processed_data directory to save a log file to

  • processor – The processor object parsed from the YAML configs

Returns: None
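
One way such a record could be kept is sketched below. The helper name save_run_info, the processor_runs.jsonl file name, the JSON-lines layout, and the flattened (name, args) signature are all assumptions for illustration. The real function takes the processor object parsed from the YAML configs, and its output format is not documented here.

```python
import json
import os
import tempfile
from datetime import datetime


def save_run_info(processed_data_dir, name, args):
    """Append one JSON line holding the processor name, arguments and run time."""
    entry = {
        "processor": name,
        "args": args,
        "run_at": datetime.now().isoformat(timespec="seconds"),
    }
    log_path = os.path.join(processed_data_dir, "processor_runs.jsonl")
    with open(log_path, "a") as log:
        log.write(json.dumps(entry) + "\n")
    return log_path


# Usage: log two runs into a throwaway directory and read them back.
with tempfile.TemporaryDirectory() as processed_dir:
    save_run_info(processed_dir, "eman2", {"steps": ["import", "reconstruct"]})
    path = save_run_info(processed_dir, "eman2", {"steps": ["reconstruct"]})
    with open(path) as log:
        runs = [json.loads(line) for line in log]
```

Appending rather than overwriting keeps a history of every run, which suits a log intended for later reference.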