Hi everyone,
My research team is training a machine learning model to detect anomalies in data transmissions across satellite links.
To improve the model's accuracy, we need thousands of labeled training examples of both healthy and damaged files.
Manually corrupting files one by one is simply not feasible for a dataset of this size.
We're looking for a guide or a Python library for generating corrupted test data files at scale. Any help from the data science community would be amazing. Thanks in advance.
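In case it helps anyone answering: here is the kind of thing we have in mind, a minimal sketch using only the Python standard library. It copies a healthy file and overwrites a handful of random byte positions to produce a "damaged" variant, with a seed so each sample is reproducible. All function and file names here are just placeholders for illustration.

```python
import random
from pathlib import Path

def corrupt_copy(src, dst, n_flips=64, seed=None):
    """Write a corrupted copy of src to dst by altering n_flips random bytes.

    Each selected byte is shifted by a nonzero offset mod 256, so every
    flip is guaranteed to change the byte it touches. Illustrative only;
    adapt the corruption model to whatever faults your links produce.
    """
    rng = random.Random(seed)
    data = bytearray(Path(src).read_bytes())
    if not data:
        raise ValueError("cannot corrupt an empty file")
    for _ in range(n_flips):
        pos = rng.randrange(len(data))
        data[pos] = (data[pos] + rng.randrange(1, 256)) % 256
    Path(dst).write_bytes(bytes(data))

def make_dataset(src, out_dir, count=1000, n_flips=64):
    """Generate `count` reproducible corrupted variants of one healthy file."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for i in range(count):
        corrupt_copy(src, out / f"corrupt_{i:05d}.bin", n_flips=n_flips, seed=i)
```

Note that uniform random byte flips are only one fault model; if your real channel produces burst errors or truncation, the labels will be more useful if the corruption routine mimics those instead.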