Utilities (titli.utils)
The titli.utils module provides utility functions and classes for data handling, loss computation, and dataset management.
Overview
These helpers support the feature extraction (titli.fe) and IDS (titli.ids) modules: a streaming CSV dataset, an RMSE loss, and directory setup for experiments.
Datasets
StreamingCSVDataset
Loss Functions
RMSELoss
Data Utilities
Directory Management
Usage Examples
Using StreamingCSVDataset
Load and iterate over CSV data with separate feature and label files:
from titli.utils import StreamingCSVDataset
from torch.utils.data import DataLoader
# Create dataset with feature and label CSV files
dataset = StreamingCSVDataset(
    feature_csv_path="features.csv",
    label_csv_path="labels.csv",
    max_samples=100000,
    label_column=0,  # Column index containing labels
)
# Create data loader
loader = DataLoader(
    dataset,
    batch_size=32,
    shuffle=False,
    num_workers=2,
)
# Iterate over batches
for features, labels in loader:
    # Process batch
    pass
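To illustrate what streaming from paired feature and label files involves, here is a minimal standard-library sketch. It is not titli's implementation: `stream_csv_pairs` and its `skip_header` parameter are hypothetical, and it assumes both files have one header row and the same row order.

```python
import csv
from itertools import islice
from typing import Iterator, List, Tuple


def stream_csv_pairs(
    feature_csv_path: str,
    label_csv_path: str,
    max_samples: int,
    label_column: int = 0,
    skip_header: bool = True,  # assumption: each file starts with a header row
) -> Iterator[Tuple[List[float], float]]:
    """Yield (features, label) pairs one row at a time (hypothetical helper)."""
    with open(feature_csv_path, newline="") as f_feat, \
         open(label_csv_path, newline="") as f_lab:
        feat_rows = csv.reader(f_feat)
        lab_rows = csv.reader(f_lab)
        if skip_header:
            next(feat_rows, None)
            next(lab_rows, None)
        # zip stops at the shorter file; islice enforces the sample cap
        for feat, lab in islice(zip(feat_rows, lab_rows), max_samples):
            yield [float(x) for x in feat], float(lab[label_column])
```

Because the function is a generator, only one row from each file is held in memory at a time, which is the property that makes a streaming dataset usable on files larger than RAM.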
Standard Pattern with IDS Models
A typical workflow extracts features to CSV, wraps them in a StreamingCSVDataset, and then trains and evaluates an IDS model:
from titli.fe import AfterImage
from titli.utils import StreamingCSVDataset
from torch.utils.data import DataLoader
from titli.ids import OCSVM
import torch
# Step 1: Extract features to CSV
fe = AfterImage(file_path="traffic.pcap")
fe.extract_features(output_path="features.csv")
# Step 2: Create DataLoader
dataset = StreamingCSVDataset(
    feature_csv_path="features.csv",
    label_csv_path="labels.csv",
    max_samples=100000,
    label_column=0,
)
train_loader = DataLoader(dataset, batch_size=32, shuffle=False, num_workers=2)
# Step 3: Train and evaluate model
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
ids = OCSVM(dataset_name="my_dataset", input_size=dataset.input_size, device=device)
ids.train_model(train_loader)
ids.evaluate(train_loader)
Using RMSELoss
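Use RMSELoss as the criterion in a PyTorch training step:
import torch
from titli.utils import RMSELoss
# Placeholder model and data, for illustration only
model = torch.nn.Linear(10, 1)
inputs = torch.randn(32, 10)
targets = torch.randn(32, 1)
# Initialize loss function
criterion = RMSELoss()
# Compute loss
predictions = model(inputs)
loss = criterion(predictions, targets)
# Backward pass
loss.backward()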
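For reference, root mean squared error is conventionally defined as shown below. This assumes RMSELoss follows the standard definition, averaging over all N elements of the prediction; this page does not say whether titli adds a small stabilizing constant inside the square root.

```latex
\mathrm{RMSE}(\hat{y}, y)
  = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(\hat{y}_i - y_i\right)^2}
  = \sqrt{\mathrm{MSE}(\hat{y}, y)}
```

Unlike MSE, the result is in the same units as the targets, which makes loss values easier to compare against the scale of the data.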
Directory Management
Create organized directory structures for experiments:
from titli.utils.data import create_directories
# Create standard directory structure
create_directories("my_experiment")
# This creates:
# - my_experiment/models/
# - my_experiment/results/
# - my_experiment/logs/
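The layout listed above can be reproduced with the standard library alone. The sketch below is not the source of create_directories; `make_experiment_dirs` is a hypothetical stand-in that assumes exactly the three subdirectories shown.

```python
from pathlib import Path
from typing import Dict


def make_experiment_dirs(root: str) -> Dict[str, Path]:
    """Create root/models, root/results and root/logs (hypothetical helper)."""
    created = {}
    for name in ("models", "results", "logs"):
        path = Path(root) / name
        # parents=True creates the root as needed; exist_ok=True makes reruns safe
        path.mkdir(parents=True, exist_ok=True)
        created[name] = path
    return created
```

Returning the created paths lets later code write to `dirs["models"]` or `dirs["logs"]` without rebuilding the path strings.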
Advanced Usage
Custom Dataset Processing
Process large CSV files without loading them fully into memory:
from titli.utils import StreamingCSVDataset
# Constructor arguments follow the earlier examples on this page
dataset = StreamingCSVDataset(
    feature_csv_path="large_features.csv",
    label_csv_path="large_labels.csv",
    max_samples=100000,
    label_column=0,
)
# Iterate sample by sample to avoid memory issues
for i, (features, labels) in enumerate(dataset):
    if i % 10000 == 0:
        print(f"Processed {i} samples")
    # Your processing logic here
Custom Loss Functions
Combine with other PyTorch loss functions:
import torch
import torch.nn as nn
from titli.utils import RMSELoss
class CombinedLoss(nn.Module):
    def __init__(self, alpha=0.5):
        super().__init__()
        self.rmse = RMSELoss()
        self.mse = nn.MSELoss()
        self.alpha = alpha

    def forward(self, pred, target):
        return (self.alpha * self.rmse(pred, target)
                + (1 - self.alpha) * self.mse(pred, target))

criterion = CombinedLoss(alpha=0.7)
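To see how alpha weights the two terms, the arithmetic can be worked through without PyTorch. This is a framework-free sketch with hypothetical helper names (`mse`, `combined_loss`), and it assumes RMSE is the square root of the mean squared error.

```python
import math
from typing import Sequence


def mse(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean squared error over paired values."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def combined_loss(pred: Sequence[float], target: Sequence[float],
                  alpha: float = 0.5) -> float:
    """alpha * RMSE + (1 - alpha) * MSE, mirroring CombinedLoss.forward."""
    m = mse(pred, target)
    return alpha * math.sqrt(m) + (1 - alpha) * m


# Every prediction is off by 2, so MSE = 4.0 and RMSE = 2.0
pred = [2.0, 2.0, 2.0, 2.0]
target = [0.0, 0.0, 4.0, 4.0]
print(combined_loss(pred, target, alpha=0.7))  # about 0.7 * 2.0 + 0.3 * 4.0
```

Since MSE grows quadratically with the error while RMSE grows linearly, the MSE term dominates for large errors even when alpha favors RMSE, which is worth keeping in mind when tuning alpha.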