Get your training data ready for your private LLMs
Reduce time spent on data cleaning by up to 90%
Aggregate knowledge from multiple formats, including PDF and HTML
from uniflow import ExtractPDFClient

client = ExtractPDFClient()

# `data` is the input to extract from; it must be defined before this call
output = client.run(data)
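To give a rough sense of what pulling text out of an HTML source can involve, here is a minimal sketch that uses only the Python standard library. It is not uniflow's implementation; the names `TextExtractor` and `html_to_text` are hypothetical and exist only for this illustration.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Hypothetical helper: collects visible text, skipping <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside a <script> or <style> element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Collapse runs of whitespace and drop empty fragments
        text = " ".join(data.split())
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def html_to_text(html):
    """Return the visible text of an HTML string, one fragment per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)
```

Calling `html_to_text(page_source)` on a page's HTML returns plain text that can then be cleaned further or combined with text taken from other formats.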