Navigation Bar

Logo
AnyParser API (YC S23) - The first LLM for document parsing with accuracy and speed | Product Hunt

AnyParser On-prem Enterprise Security Whitepaper

Oct 23, 2024
Back to Blogs
Authors: 
CambioML Team @  CambioML

Executive Summary

AnyParser On-prem is a cutting-edge document parsing solution designed to meet the stringent security requirements of modern enterprises. This whitepaper outlines the robust security measures implemented in AnyParser On-prem, ensuring data privacy, regulatory compliance, and operational efficiency.

1. Introduction

In today's data-driven business landscape, efficient document parsing is crucial for streamlining operations and decision-making processes. However, the use of cloud-based language models for parsing sensitive documents introduces significant data privacy risks. AnyParser On-prem addresses these challenges by providing a secure, on-premises solution that leverages the power of advanced language models while maintaining complete control over sensitive data.

Why CambioML AnyParser Stands Out: Speed and Accuracy

The CambioML model excels in both speed and accuracy, striking the ideal balance for many business needs. It's faster than most large proprietary models while providing higher accuracy compared to traditional Optical Character Recognition (OCR) systems. This means your business can achieve the best of both worlds—quick, reliable document parsing without sacrificing performance.

AnyParser Performance Graph

The graph above compares the throughput and accuracy of different document parsing models.

  • X-axis (Horizontal): Represents the "Average Throughput (tokens/s)". This measures how quickly each model can process text, with higher values indicating faster processing speeds.
  • Y-axis (Vertical): Represents "Accuracy". This measures how correctly each model interprets and extracts information from documents, with higher values indicating better performance.

Anyparser outperforms other models in both accuracy (about 0.82) and throughput (around 160 tokens/s), offering an optimal balance for enterprise document parsing needs.

2. Security Architecture Overview

AnyParser On-prem is designed with a security-first approach, offering an entire stack from the model itself to the serving infrastructure. This comprehensive solution can be seamlessly provisioned into private cloud environments such as AWS, GCP, and Azure.

2.1 Deployment Model

The on-premises deployment model ensures that all data processing occurs within the organization's private environment. This approach eliminates the need to transmit sensitive documents outside the company's control, addressing concerns related to data sovereignty and regulatory compliance.

2.2 Infrastructure Security

AnyParser On-prem leverages industry-standard tools and best practices for secure infrastructure deployment:

  • Terraform is used for setting up and managing cloud infrastructure, ensuring consistency and reducing the risk of misconfigurations.
  • Docker containers are employed to isolate the application and its dependencies, enhancing security and portability.
  • Deployment to EC2 or EKS is managed through Terraform, maintaining infrastructure-as-code principles and enabling version control of the deployment process.

2.3 Network Security

To ensure secure access and communication:

  • Kong or Nginx is set up as the ingress controller, providing robust traffic management and security features.
  • DNS configuration is implemented for seamless internal access, reducing exposure to external threats.
Cambio On-prem Design

3. Data Privacy and Protection

AnyParser On-prem addresses the critical need for data privacy in document parsing:

  • All data remains within the organization's private environment, eliminating the risks associated with cloud-based proprietary models.
  • The solution complies with regulatory requirements and internal data protection policies by ensuring that sensitive documents are not sent outside the private environment.

4. Access Control and Authentication

AnyParser On-prem implements robust access control and authentication mechanisms, including:

  • Role-based access control (RBAC) to ensure that only authorized personnel can access the system and parsed data.
  • Multi-factor authentication (MFA) for enhanced user verification.
  • Regular access audits and logging to monitor and review system usage.

5. Encryption

To further protect sensitive data, AnyParser On-prem offers customer an option to implement:

  • Data encryption at rest using industry-standard encryption algorithms.
  • Encryption in transit using TLS/SSL protocols for all network communications.
  • Secure key management practices to safeguard encryption keys.

6. Continuous Monitoring and Incident Response

To maintain a robust security posture, AnyParser On-prem offers customer an option to implement:

  • Real-time monitoring of the infrastructure and application for potential security threats.
  • Regular vulnerability assessments and penetration testing.
  • A well-defined incident response plan to address and mitigate security incidents promptly.

7. Future Security Enhancements

CambioML is committed to continuous innovation in security. The upcoming bare-metal Kubernetes-based deployment option will provide enterprises with even greater control and flexibility in their infrastructure choices. This enhancement will allow organizations to deploy AnyParser On-prem on their own hardware, further strengthening data control, reducing latency, and optimizing operational environments.

8. Performance and Scalability

While maintaining stringent security measures, AnyParser On-prem does not compromise on performance:

  • The solution offers an optimal balance of speed and accuracy, outperforming other models in both metrics.
  • With an accuracy of about 0.82 and a throughput of around 160 tokens/s, AnyParser On-prem meets enterprise document parsing needs efficiently.

Conclusion

AnyParser On-prem provides a secure, high-performance, and accurate document parsing solution that addresses the critical security and privacy concerns of modern enterprises. By offering an on-premises deployment model, robust infrastructure security, and a commitment to continuous improvement, AnyParser On-prem enables organizations to harness the power of advanced document parsing while maintaining complete control over their sensitive data.

Call to Action

For organizations looking to transform their document workflows without compromising on security, AnyParser On-prem offers a powerful and secure solution. Contact CambioML today to  schedule a demo or  start a trial and experience the benefits of secure, efficient document parsing.

Footer