Share data in the Splunk Machine Learning Toolkit
What data is collected
The Splunk Machine Learning Toolkit collects the following basic usage information:
| Component | Description | Example |
|---|---|---|
| algo_name | Name of the algorithm used in fit or apply. | JSON |
| app_context | Name of the app from which the search is run. | JSON |
| apply_time | Time the apply command took. | JSON |
| app.session.Splunk_ML_Toolkit.changeSmartAssistantStep | User progress through an MLTK Smart Assistant. | JSON |
| app.session.Splunk_ML_Toolkit.createExperiment | User creating an MLTK Experiment. | JSON |
| app.session.Splunk_ML_Toolkit.createExperimentAlert | Users creating alerts for MLTK Experiments. | JSON |
| app.session.Splunk_ML_Toolkit.loadAssistant | Number of times the user has loaded an MLTK Assistant. | JSON |
| app.session.Splunk_ML_Toolkit.saveExperiment | Users saving their work in MLTK Experiments. | JSON |
| app.session.Splunk_ML_Toolkit.scheduleExperimentTraining | Users scheduling model re-training for MLTK Experiments. | JSON |
| col_dimension | Dimension of the dataset, taken from the model schema. Triggered during apply. | JSON |
| columns | Number of columns being run through the fit command. | JSON |
| command | Command that was run: fit, apply, or score. | JSON |
| csv_parse_time | CSV parse time. | JSON |
| csv_read_time | CSV read time. | JSON |
| csv_render_time | CSV render time. | JSON |
| deployment.app | Apps installed per Splunk instance. | JSON |
| df_shape | Shape of the data input received from Splunk. Triggered during apply. | JSON |
| example_name | Name of the Showcase example being run. | JSON |
| experiment_id | ID of the fit and apply run on the Experiments page. All preprocessing steps and the final fit have the same ID. | JSON |
| fit_time | Amount of time it took to run the fit command. | JSON |
| full_punct | The punct of the data during fit or apply. | JSON |
| handle_time | Time the handler took to process the data. | JSON |
| metrics_type | Type of request sent. Used to differentiate model upload and model inference call flows. Contains two values. | JSON |
| modelId | Model ID under which the user saves their model. | JSON |
| model_upload | Monitors the model upload process to determine whether the model has been successfully uploaded and is ready for inference. | JSON |
| numColumns | Total number of columns in the dataset. | JSON |
| numRows | Total number of rows (events) in the dataset. | JSON |
| num_fields | Total number of fields. | JSON |
| num_fields_fs | Number of fields that have the fs (Field Selector) prefix. | JSON |
| num_fields_PC | Number of fields that have the PC (preprocessed) prefix. | JSON |
| num_fields_prefixed | Total number of preprocessed fields. | JSON |
| num_fields_RS | Number of fields that have the RS (Robust Scaler) prefix. | JSON |
| num_fields_SS | Number of fields that have the SS (Standard Scaler) prefix. | JSON |
| num_fields_tfidf | Number of fields that have used term frequency-inverse document frequency preprocessing. | JSON |
| onnx_input_shape | Shape of the input data stored in the ONNX model schema. Triggered during apply. | JSON |
| onnx_model_size_on_disk | Total size, in MB, of the model file on disk after encoding. Triggered during model upload. | JSON |
| onnx_upload_time | Time taken to upload an ONNX model file from the UI. Triggered during model upload. | JSON |
| orig_sourcetype | The original sourcetype of the machine data. | JSON |
| params | Optional parameters used in the fit step. | JSON |
| partialFit | Whether or not the fit is a partial fit action. | JSON |
| PID | Process identifier associated with the command. | JSON |
| pipeline_stage | Each preprocessing step on the Experiments page is assigned a number, starting from 0. This helps determine the order of the preprocessing steps and the length of the pipeline. | JSON |
| rows | Number of rows being run through the fit command. | JSON |
| scoringName | Name of the scoring operation, if whitelisted. If the name is not whitelisted, the hash of the scoringName is logged instead. | CODE |
| scoringTimeSec | Time taken by the scoring operation. | CODE |
| UUID | Universally unique identifier associated with the command. This is 128-bit and is used to keep each fit/apply unique. | JSON |
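
The scoringName entry above describes logging the name only when it is whitelisted and logging a hash of it otherwise. A minimal Python sketch of that pattern follows. It is an illustration under stated assumptions, not the toolkit's implementation: the function name `loggable_scoring_name`, the contents of `ALLOWED_SCORING_NAMES`, and the choice of SHA-256 are all hypothetical, because this page does not say which names are whitelisted or which hash is used.

```python
import hashlib

# Hypothetical allow list. The actual list used by MLTK is not given here.
ALLOWED_SCORING_NAMES = frozenset({"accuracy_score", "f1_score"})


def loggable_scoring_name(scoring_name, allowed=ALLOWED_SCORING_NAMES):
    """Return the value to log for a scoring operation.

    Names on the allow list are returned unchanged. Any other name is
    replaced by a hex digest, so the raw name is never logged.
    SHA-256 is an assumption made for this sketch.
    """
    if scoring_name in allowed:
        return scoring_name
    return hashlib.sha256(scoring_name.encode("utf-8")).hexdigest()
```

With this sketch, `loggable_scoring_name("accuracy_score")` returns the name unchanged, while a name that is not on the list, such as `"my_custom_metric"`, comes back as a 64-character hexadecimal digest.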