sage_analysis.Model
This module contains the Model class. The Model class contains all the data paths, cosmology, etc. for calculating galaxy properties.
To read SAGE data, we make use of specialized Data Classes (e.g., SageBinaryData and SageHdf5Data). We refer to ../user/data_class for more information about adding your own Data Class to ingest data.
To calculate (and plot) extra properties from the SAGE output, we refer to ../user/calc.rst and ../user/plotting.rst.
-
class
sage_analysis.model.
Model
(sage_file: str, sage_output_format: Optional[str], label: Optional[str], first_file_to_analyze: int, last_file_to_analyze: int, num_sage_output_files: Optional[int], random_seed: Optional[int], IMF: str, plot_toggles: Dict[str, bool], plots_that_need_smf: List[str], sample_size: int = 1000, sSFRcut: float = -11.0)
Handles all the galaxy data (including calculated properties) for a SAGE model.
The ingestion of data is handled by individual Data Classes (e.g., SageBinaryData and SageHdf5Data). We refer to ../user/data_class for more information about adding your own Data Class to ingest data.
-
__init__
(sage_file: str, sage_output_format: Optional[str], label: Optional[str], first_file_to_analyze: int, last_file_to_analyze: int, num_sage_output_files: Optional[int], random_seed: Optional[int], IMF: str, plot_toggles: Dict[str, bool], plots_that_need_smf: List[str], sample_size: int = 1000, sSFRcut: float = -11.0)
Sets the galaxy path and number of files to be read for a model. Also initialises the plot toggles that dictate which properties will be calculated.
Parameters:
- label (str, optional) – The label that will be placed on the plots for this model. If not specified, will use FileNameGalaxies read from sage_file.
- sage_output_format (str, optional) – If not specified, will use the OutputFormat read from sage_file.
- num_sage_output_files (int, optional) – Specifies the number of output files that were generated by running SAGE. This can be different to the range specified by [first_file_to_analyze, last_file_to_analyze].
Notes
This variable only needs to be specified if sage_output_format is sage_binary.
- sample_size (int, optional) – Specifies the length of the properties attributes stored as 1-dimensional ndarray. These properties are initialized using init_scatter_properties().
- sSFRcut (float, optional) – The specific star formation rate above which a galaxy is flagged as “star forming”. Units are log10.
-
calc_properties
(calculation_functions, gals, snapshot: int)
Calculates galaxy properties for a single file of galaxies.
Parameters:
- calculation_functions (dict [string, function]) – Specifies the functions used to calculate the properties. All functions in this dictionary are called on the galaxies. The function signature is required to be func(Model, gals).
- gals (exact format given by the Model Data Class) – The galaxies for this file.
- snapshot (int) – The snapshot that we’re calculating properties for.
Notes
If sage_output_format is sage_binary, gals is a numpy structured array. If sage_output_format is sage_hdf5, gals is an open HDF5 group. We refer to ../user/data_class for more information about adding your own Data Class to ingest data.
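The func(Model, gals) signature described above can be illustrated with a small stand-alone sketch. It does not use sage_analysis itself: ToyModel, calc_total_stellar_mass, calc_properties_sketch and the "StellarMass" field name are all hypothetical, and the properties dictionary is flattened (no per-snapshot key) for brevity.

```python
import numpy as np


class ToyModel:
    """Hypothetical stand-in for Model; it only holds a properties dict."""

    def __init__(self):
        self.properties = {"total_stellar_mass": 0.0}


def calc_total_stellar_mass(model, gals):
    # Follows the required signature func(Model, gals).
    # "StellarMass" is an assumed field name for this toy structured array.
    model.properties["total_stellar_mass"] += float(np.sum(gals["StellarMass"]))


def calc_properties_sketch(model, calculation_functions, gals):
    # Call every function in the dictionary on the galaxies of one file.
    for func in calculation_functions.values():
        func(model, gals)


# A numpy structured array, as in the sage_binary case.
gals = np.array([(1.0,), (2.5,), (0.5,)], dtype=[("StellarMass", "f8")])
model = ToyModel()
calc_properties_sketch(
    model, {"total_stellar_mass": calc_total_stellar_mass}, gals
)
```

Because each function mutates the model in place, results accumulate when the same functions are called on the galaxies of successive files.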
-
calc_properties_all_files
(calculation_functions, snapshot: int, close_file: bool = True, use_pbar: bool = True, debug: bool = False)
Calculates galaxy properties for all files of a single Model.
Parameters:
- calculation_functions (dict [string, list(function, dict[string, variable])]) – Specifies the functions used to calculate the properties of this Model. The key of this dictionary is the name of the plot toggle. The value is a list with the 0th element being the function and the 1st element being a dictionary of additional keyword arguments to be passed to the function. The inner dictionary is keyed by the keyword argument names with the value specifying the keyword argument value.
All functions in this dictionary are called after the galaxies for each sub-file have been loaded. The function signature is required to be func(Model, gals, <Extra Keyword Arguments>).
- snapshot (int) – The snapshot that we’re calculating properties for.
- close_file (boolean, optional) – Some data formats read from a single file rather than opening and closing the sub-files in read_gals(). Hence, once the properties are calculated, the file must be closed. This variable flags whether the Data Class-specific close_file() method should be called upon completion of this method.
- use_pbar (boolean, optional) – If set, uses the tqdm package to create a progress bar.
- debug (boolean, optional) – If set, prints out extra useful debug information.
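As a rough illustration of the calculation_functions layout described above (toggle name mapped to a function plus a dictionary of keyword arguments), the following sketch loops over toy sub-files and unpacks the keyword arguments into each call. The function count_galaxies, its min_mass argument, and the plain dict used in place of a Model are hypothetical.

```python
def count_galaxies(model, gals, min_mass=0.0):
    # Signature: func(Model, gals, <Extra Keyword Arguments>).
    model["count"] += sum(1 for mass in gals if mass >= min_mass)


# Key: plot toggle name. Value: [function, {keyword argument name: value}].
calculation_functions = {"count": [count_galaxies, {"min_mass": 1.0}]}

# Two toy "sub-files", each a list of galaxy masses.
sub_files = [[0.5, 1.5, 2.0], [3.0, 0.1]]

model = {"count": 0}  # stand-in for a Model instance
for gals in sub_files:
    # Once a sub-file is loaded, call every function with its extra kwargs.
    for toggle, (func, kwargs) in calculation_functions.items():
        func(model, gals, **kwargs)
```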
-
init_binned_properties
(bin_low: float, bin_high: float, bin_width: float, bin_name: str, property_names: List[str], snapshot: int)
Initializes the properties (and respective bins) that will be binned on some variable. For example, the stellar mass function (SMF) will describe the number of galaxies within a stellar mass bin.
bins can be accessed via Model.bins["bin_name"] and are initialized as ndarray. properties can be accessed via Model.properties["property_name"] and are initialized using numpy.zeros.
Parameters:
- bin_low, bin_high, bin_width (floats) – Values that define the minimum, maximum and width of the bins respectively. This defines the binning axis that the property_names properties will be binned on.
- bin_name (string) – Name of the binning axis, accessed by Model.bins["bin_name"].
- property_names (list of strings) – Name of the properties that will be binned along the defined binning axis. Properties can be accessed using Model.properties["property_name"]; e.g., Model.properties["SMF"] would return the stellar mass function that is binned using the bin_name bins.
- snapshot (int) – The snapshot we’re initialising the properties for.
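A minimal sketch of what binned initialization could look like with plain numpy follows. It is not the library's implementation: the helper name init_binned_sketch is hypothetical, and treating the bins as edges (so each property has len(bins) - 1 entries) is an assumption made for this example.

```python
import numpy as np


def init_binned_sketch(bin_low, bin_high, bin_width, property_names):
    # Assumption: bins are stored as edges, so there are len(bins) - 1 bins.
    bins = np.arange(bin_low, bin_high + bin_width, bin_width)
    properties = {name: np.zeros(len(bins) - 1) for name in property_names}
    return bins, properties


bins, properties = init_binned_sketch(8.0, 12.0, 0.5, ["SMF"])

# Accumulate counts for one toy file of log10 stellar masses.
log_mass = np.array([8.2, 8.3, 10.1])
counts, _ = np.histogram(log_mass, bins=bins)
properties["SMF"] += counts
```

Starting from zeros means the counts from each file can simply be added to the running total.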
-
init_scatter_properties
(property_names: List[str], snapshot: int)
Initializes the properties that will be extended as ndarray. These are used to plot, e.g., the star formation rate versus stellar mass for a subset of sample_size galaxies. Initializes as empty ndarray.
Parameters:
- property_names (list of strings) – Name of the properties that will be extended as ndarray.
- snapshot (int) – The snapshot we’re initialising the properties for.
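The idea of starting from empty arrays and extending them file by file can be sketched as below. The property names and toy values are hypothetical, and numpy.append is just one way to extend an array; the library may do this differently.

```python
import numpy as np

property_names = ["sfr", "stellar_mass"]

# Each scatter property starts life as an empty ndarray.
properties = {name: np.array([]) for name in property_names}

# Toy values selected from two files.
per_file_values = [
    {"sfr": [1.0, 2.0], "stellar_mass": [1e10, 2e10]},
    {"sfr": [0.5], "stellar_mass": [5e9]},
]

for file_values in per_file_values:
    for name in property_names:
        # Extend the running array with this file's values.
        properties[name] = np.append(properties[name], file_values[name])
```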
-
init_single_properties
(property_names: List[str], snapshot: int) → None
Initializes the properties that are described using a single number. This is used to plot, e.g., the sum of stellar mass across all galaxies. Initializes as 0.0.
Parameters:
- property_names (list of strings) – Name of the properties that will be described using a single number.
- snapshot (int) – The snapshot we’re initialising the properties for.
-
select_random_galaxy_indices
(inds: numpy.ndarray, num_inds_selected_already: int) → numpy.ndarray
Selects random indices (representing galaxies) from inds. This method assumes that the total number of galaxies selected across all SAGE files analyzed is sample_size and that (preferably) these galaxies should be selected equally amongst all files analyzed.
For example, if we are analyzing 8 SAGE output files and wish to select 10,000 galaxies, this function would select 1,250 indices from inds.
If the length of inds is less than the number of requested values (e.g., inds only contains 1,000 values), then the next file analyzed will attempt to select 1,500 random galaxies (1,250 base plus an additional 250, as the previous file could not find enough galaxies).
At the end of the analysis, if there have not been enough galaxies selected, then a message is sent to the user.
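The arithmetic in the example above (1,250 per file, with a shortfall carried to the next file) can be reproduced with a short sketch. This is not the method's actual code: select_indices_sketch and its extra parameters (sample_size, files_done, num_files, rng) are hypothetical and are passed explicitly here only to keep the example self-contained.

```python
import numpy as np


def select_indices_sketch(inds, num_selected_already, sample_size,
                          files_done, num_files, rng):
    # Cumulative number of galaxies we would like to have after this file.
    target_total = sample_size * (files_done + 1) // num_files
    # Any shortfall from earlier files is automatically included here.
    num_wanted = target_total - num_selected_already
    # Cannot take more galaxies than this file offers.
    num_to_take = min(num_wanted, len(inds))
    return rng.choice(inds, size=num_to_take, replace=False)


rng = np.random.default_rng(0)

# File 0 only has 1,000 galaxies, so fewer than 1,250 can be selected.
first = select_indices_sketch(np.arange(1000), 0, 10_000, 0, 8, rng)

# File 1 makes up the difference: 1,250 base plus the missing 250.
second = select_indices_sketch(np.arange(5000), len(first), 10_000, 1, 8, rng)
```

Tracking a cumulative target rather than a fixed per-file quota is one simple way to carry a shortfall forward without extra bookkeeping.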
-
IMF
The initial mass function.
Type: {"Chabrier", "Salpeter"}
-
base_sage_data_path
Base path to the output data. This is the path without specifying any extra information about redshift or the file extension itself.
Type: string
-
bins
The bins used to bin some properties. Bins are initialized through init_binned_properties(). Key is the name of the bin (bin_name in init_binned_properties()).
Type: dict[string, ndarray]
-
calculation_functions
A dictionary of functions that are used to compute the properties of galaxies. Here, the string is the name of the toggle (e.g., "SMF"), and the value is a tuple containing the function itself (e.g., calc_SMF()) and another dictionary that specifies any optional keyword arguments to that function, with keys as the name of the variable (e.g., "calc_sub_populations") and values as the variable value (e.g., True).
Type: dict[str, tuple[func, dict[str, any]]]
-
first_file_to_analyze
The first SAGE sub-file to be read. If sage_output_format is sage_binary, files read must be labelled sage_data_path.XXX. If sage_output_format is sage_hdf5, the file read will be sage_data_path and the groups accessed will be Core_XXX. In both cases, XXX represents the numbers in the range [first_file_to_analyze, last_file_to_analyze] inclusive.
Type: int
-
last_file_to_analyze
The last SAGE sub-file to be read. If sage_output_format is sage_binary, files read must be labelled sage_data_path.XXX. If sage_output_format is sage_hdf5, the file read will be sage_data_path and the groups accessed will be Core_XXX. In both cases, XXX represents the numbers in the range [first_file_to_analyze, last_file_to_analyze] inclusive.
Type: int
-
num_gals_all_files
Number of galaxies across all files. For HDF5 data formats, this represents the number of galaxies across all Core_XXX sub-groups.
Type: int
-
num_sage_output_files
The number of files that SAGE wrote. This will be equal to the number of processors that SAGE ran with.
Notes
If sage_output_format is sage_hdf5, this attribute is not required.
Type: int
-
output_path
Path to where some plots will be saved. Used for plot_spatial_3d().
Type: string
-
parameter_dirpath
The directory path to where the SAGE parameter file is located. This is only the base directory path and does not include the name of the file itself.
Type: str
-
plot_toggles
Specifies which plots should be created for this model. This will control which properties should be calculated; e.g., if no stellar mass function is to be plotted, the stellar mass function will not be computed.
Type: dict[str, bool]
-
plots_that_need_smf
Specifies the plot toggles that require the stellar mass function to be properly computed and analyzed. For example, plotting the quiescent fraction of galaxies requires knowledge of the total number of galaxies. The strings here must EXACTLY match the keys in plot_toggles.
Type: list of strings
-
properties
The galaxy properties stored across the input files and snapshots. These properties are updated within the respective calc_<plot_toggle> functions.
The outside key is "snapshot_XX" where XX is the snapshot number for the property. The inner key is the name of the property (e.g., "SMF").
Type: dict[string, dict[string, ndarray]] or dict[string, dict[string, float]]
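The nested layout described above can be pictured with a small stand-alone dictionary. The snapshot number 63 and the property names other than "SMF" are hypothetical, and whether XX is zero-padded for single-digit snapshots is not stated here, so a two-digit snapshot is used to sidestep that question.

```python
import numpy as np

snapshot = 63
snapshot_key = f"snapshot_{snapshot}"  # outer key: "snapshot_XX"

properties = {
    snapshot_key: {
        "SMF": np.zeros(8),          # binned property (ndarray)
        "sfr": np.array([]),         # scatter property (ndarray)
        "total_stellar_mass": 0.0,   # single-number property (float)
    }
}

# Updates address the snapshot first, then the property name.
properties[snapshot_key]["SMF"][2] += 5
properties[snapshot_key]["total_stellar_mass"] += 1.5e10
```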
-
random_seed
Specifies the seed used for the random number generator, used to select galaxies for plotting purposes. If None, then uses default call to seed().
Type: Optional[int]
-
sSFRcut
The specific star formation rate above which a galaxy is flagged as “star forming”. Units are log10.
Type: float
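A minimal sketch of how such a cut could be applied is shown below. It assumes the star formation rate is in solar masses per year and the stellar mass is in solar masses, so that the specific star formation rate is in log10(yr^-1); these units and the toy values are assumptions rather than something stated on this page.

```python
import numpy as np

sSFRcut = -11.0  # default value from the Model signature

# Toy galaxies (assumed units: Msun for mass, Msun/yr for SFR).
stellar_mass = np.array([1e10, 1e10, 1e11])
sfr = np.array([1.0, 0.01, 0.5])

# log10 of the specific star formation rate (SFR per unit stellar mass).
log_ssfr = np.log10(sfr / stellar_mass)

# Galaxies above the cut are flagged as star forming.
star_forming = log_ssfr > sSFRcut
```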
-
sage_data_path
Path to the output data. If sage_output_format is sage_binary, files read must be labelled sage_data_path.XXX. If sage_output_format is sage_hdf5, the file read will be sage_data_path and the groups accessed will be Core_XXX at snapshot snapshot. In both cases, XXX represents the numbers in the range [first_file_to_analyze, last_file_to_analyze] inclusive.
Type: string
-
sage_output_format
The output format SAGE wrote in. A specific Data Class (e.g., SageBinaryData and SageHdf5Data) must be written and used for each sage_output_format option. We refer to ../user/data_class for more information about adding your own Data Class to ingest data.
Type: {"sage_binary", "sage_hdf5"}
-
sample_size
Specifies the length of the properties attributes stored as 1-dimensional ndarray. These properties are initialized using init_scatter_properties().
Type: int
-
snapshot
Specifies the snapshot to be read. If sage_output_format is sage_hdf5, this specifies the HDF5 group to be read. Otherwise, if sage_output_format is sage_binary, this attribute will be used to index redshifts and generate the suffix for sage_data_path.
Type: int
-
volume
Volume spanned by the trees analyzed by this model. This depends upon the number of files processed, [first_file_to_analyze, last_file_to_analyze], relative to the total number of files the simulation spans over, num_sim_tree_files.
Notes
This is not necessarily box_size cubed. It is possible that this model is only analysing a subset of files and hence the volume will be less.
Type: float
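One plausible reading of the description above is that the volume scales with the fraction of files analyzed. The sketch below encodes that reading; the helper volume_sketch and the linear scaling are assumptions for illustration, not the library's confirmed formula.

```python
def volume_sketch(box_size, first_file_to_analyze, last_file_to_analyze,
                  num_sim_tree_files):
    # The file range is inclusive at both ends.
    num_files = last_file_to_analyze - first_file_to_analyze + 1
    # Assumption: each file covers an equal share of the simulation volume.
    return box_size ** 3 * num_files / num_sim_tree_files


# Analyzing 2 of 8 files in a box of side 62.5 gives a quarter of the volume.
partial_volume = volume_sketch(62.5, 0, 1, 8)
full_volume = volume_sketch(62.5, 0, 7, 8)
```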
-