Data integrity is the assurance that GxP data is complete, consistent, accurate, and trustworthy throughout its entire lifecycle, from initial creation through archiving and disposal. In regulatory terms, it means data must satisfy ALCOA+ principles at every stage, whether generated on paper, in an instrument, or by a computerized system. Regulators treat data integrity not as a technical IT function but as a fundamental quality obligation. Failures compromise product quality decisions, patient safety, and the reliability of every batch record, laboratory result, and regulatory submission built on that data.
ALCOA+ defines the required attributes of GxP data. ALCOA stands for Attributable, Legible, Contemporaneous, Original, and Accurate. The plus adds Complete, Consistent, Enduring, and Available. Several of these attributes carry a specific practical meaning. Attributable means traceable to a named individual. Contemporaneous means recorded at the time of the activity. Original means the first capture is preserved rather than reconstructed. Enduring means data survives system migrations and media changes. Available means retrievable on demand for the full retention period. Together these attributes define what regulators examine during every data integrity inspection.
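As a rough illustration, a few of these attributes can be expressed as automated checks on a single record. The sketch below is hypothetical: the `GxpRecord` fields, the `alcoa_findings` function, and the five-minute recording tolerance are assumptions made for the example, not requirements taken from any guidance.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass(frozen=True)
class GxpRecord:
    """Hypothetical record structure; field names are illustrative only."""
    value: Optional[str]
    recorded_by: Optional[str]   # individual user ID, never a shared account
    performed_at: datetime       # when the activity took place
    recorded_at: datetime        # when the entry was made
    is_original: bool            # True if this is the first capture


# Assumed tolerance; a real limit would come from the site's own procedures.
MAX_RECORDING_DELAY = timedelta(minutes=5)


def alcoa_findings(record: GxpRecord) -> List[str]:
    """Return a list of ALCOA+ concerns for one record (empty if none found)."""
    findings = []
    if not record.recorded_by:
        findings.append("Attributable: no individual user recorded")
    delay = record.recorded_at - record.performed_at
    if delay > MAX_RECORDING_DELAY or delay < timedelta(0):
        findings.append("Contemporaneous: entry not made at the time of the activity")
    if not record.is_original:
        findings.append("Original: entry is not the first capture")
    if record.value is None or record.value.strip() == "":
        findings.append("Complete: no value recorded")
    return findings
```

Checks like these cover only the attributes that can be tested mechanically. Legibility, accuracy, and consistency still depend on human review.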
Data integrity and data quality are related but distinct concepts. Data integrity ensures that records are complete, unaltered, and attributable — that what was recorded reflects what actually occurred, with no gaps or unauthorized modifications. Data quality addresses whether the underlying process produces accurate, scientifically meaningful results. A system can produce high-quality results but record them with integrity failures — for example, a correct analytical result entered retrospectively without a contemporaneous timestamp. Regulators require both: valid science supported by records that demonstrate it was conducted as documented.
Data integrity failures dominate FDA warning letters because they undermine the entire basis of GMP oversight. When records cannot be trusted — because audit trails are missing, data was deleted and rerun, or results were entered retrospectively — the FDA cannot confirm manufacturing and testing occurred as documented. Approximately 40% of FDA citations in recent years have involved data integrity issues. The agency treats systemic data integrity failure as more serious than isolated production deviations because it calls every record produced at the site into question.
Data governance is the organizational framework — policies, procedures, roles, and responsibilities — governing how data is generated, captured, reviewed, and retained across its lifecycle. Unlike technical controls, data governance addresses the human and process dimensions of data integrity: who is accountable for record accuracy, how management oversees data practices, and how integrity risks are identified and escalated. Regulators increasingly cite inadequate data governance as a root cause finding in warning letters — recognizing that technical system controls alone cannot sustain data integrity without a supporting quality culture.
The data lifecycle encompasses every stage from the design of data capture through destruction at the end of the required retention period. Stages include: design of data capture processes, data generation and recording, review and approval, storage and backup, retrieval for batch release or submission, transfer or migration between systems, archiving, and eventual destruction. Regulatory expectations apply at every stage, not only at the point of recording. Weaknesses in archiving, migration, or backup verification are as audit-significant as weaknesses in initial data capture.
Chromatography data systems are among the most scrutinized systems in data integrity enforcement. Common violations include: deleting and rerunning injections without documentation, manipulating integration parameters retrospectively without audit trail capture, using shared administrator accounts, and disabling audit trails during analysis. CDS raw data — the original detector signal — must be preserved in its native format, unmodified, with complete processing audit trails. FDA warning letters involving CDS consistently cite inability to reconstruct the full analytical sequence from preserved raw data.
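The violation patterns listed above lend themselves to a simple screening pass over exported audit trail events. This is a minimal sketch under stated assumptions: the event dictionary keys (`injection`, `action`, `user`, `reason`), the action names, and the `SHARED_ACCOUNTS` list are all hypothetical and do not correspond to any particular CDS vendor's export format.

```python
from typing import Dict, List

# Hypothetical shared logins; a real list would come from the site's user inventory.
SHARED_ACCOUNTS = {"admin", "lab", "hplc"}


def flag_cds_events(events: List[Dict[str, str]]) -> List[str]:
    """Flag audit trail events that warrant reviewer attention."""
    flags = []
    for event in events:
        action = event.get("action", "")
        user = event.get("user", "")
        reason = event.get("reason", "").strip()
        label = f"{event.get('injection', '?')}: "
        if action == "delete":
            flags.append(label + "injection deleted")
        if action == "reintegrate" and not reason:
            flags.append(label + "integration changed without documented reason")
        if action == "audit_trail_off":
            flags.append(label + "audit trail disabled")
        if user.lower() in SHARED_ACCOUNTS:
            flags.append(label + f"performed under shared account '{user}'")
    return flags
```

A screen like this only prioritizes events for a human reviewer. It does not replace review of the full analytical sequence against the preserved raw data.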
Out-of-specification results present high data integrity risk because of organizational pressure to repeat tests or invalidate results without adequate scientific justification. Every OOS result must be investigated before any retesting occurs — retesting without a documented invalidation rationale is a data integrity failure. Deleting original OOS injections, averaging results to obscure failures, or initiating unofficial retesting are among the most serious violations. FDA warning letters consistently cite inadequate OOS investigation as evidence of broader data integrity culture failures at a facility.
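The rule that investigation precedes retesting can be pictured as a gate in whatever system schedules the retest. The sketch below is illustrative only: `OosInvestigation`, its fields, and `retest_blockers` are hypothetical names, and a real workflow would add approvals and further conditions defined in the site's OOS procedure.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class OosInvestigation:
    """Hypothetical minimal state of an OOS investigation."""
    result_id: str
    investigation_opened: bool = False
    invalidation_rationale: Optional[str] = None


def retest_blockers(investigation: OosInvestigation) -> List[str]:
    """Return the reasons a retest may not start yet (empty list if none)."""
    blockers = []
    if not investigation.investigation_opened:
        blockers.append("no investigation has been opened")
    if not (investigation.invalidation_rationale or "").strip():
        blockers.append("no documented invalidation rationale")
    return blockers
```

Returning the list of blockers, rather than a bare yes or no, leaves a record of why a retest request was refused.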
Paper records carry the same data integrity obligations as electronic records. Every entry must be made in ink at the time of the activity, be attributable to the individual who made it, and be preserved without obliteration — errors crossed out with a single line, initialed, dated, and explained. Pre-dating, post-dating, writing over previous entries, use of correction fluid, and copying data from unofficial scratchpads are all data integrity violations. Paper systems are often harder to control than electronic systems because they lack automated timestamps and system-enforced access controls.
A LIMS supports data integrity when it is validated and configured to enforce individual user attribution, tamper-evident audit trails for all data entries and modifications, system-generated timestamps that users cannot adjust, and controls preventing deletion or overwriting of original results. Audit trails must be generated automatically, and operators must never be able to bypass or disable audit trail capture. Regular LIMS audit trail review, defined in an SOP with documented frequency, is a specific FDA inspection expectation that many sites fail to fulfill.
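One common way to make an audit trail tamper-evident is to chain entries together with cryptographic hashes, so that altering any earlier entry invalidates everything after it. The sketch below shows that technique in general terms. It is an assumption that a given LIMS works this way, and the `AuditTrail` class and its field names are hypothetical.

```python
import hashlib
import json
from datetime import datetime, timezone


class AuditTrail:
    """Append-only, hash-chained audit trail (illustrative sketch)."""

    GENESIS = "0" * 64  # placeholder hash preceding the first entry

    def __init__(self):
        self._entries = []

    @staticmethod
    def _digest(entry):
        # Hash every field except the stored hash itself, in a stable key order.
        payload = {k: v for k, v in entry.items() if k != "hash"}
        encoded = json.dumps(payload, sort_keys=True).encode("utf-8")
        return hashlib.sha256(encoded).hexdigest()

    def append(self, user_id, action, old_value, new_value, reason):
        entry = {
            "user_id": user_id,
            "action": action,
            "old_value": old_value,
            "new_value": new_value,
            "reason": reason,
            # Timestamp comes from the system clock, not from user input.
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "prev_hash": self._entries[-1]["hash"] if self._entries else self.GENESIS,
        }
        entry["hash"] = self._digest(entry)
        self._entries.append(entry)

    def verify(self):
        """Return True only if no entry has been altered, removed, or reordered."""
        prev_hash = self.GENESIS
        for entry in self._entries:
            if entry["prev_hash"] != prev_hash or entry["hash"] != self._digest(entry):
                return False
            prev_hash = entry["hash"]
        return True
```

Hash chaining detects tampering but does not prevent it. Access controls and periodic audit trail review remain necessary alongside it.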
Remediating a data integrity failure requires systematic investigation of root cause and extent, followed by documented corrective and preventive actions addressing both technical controls and quality culture. Typical steps include: retrospective review of affected records, root cause analysis identifying systemic gaps, immediate technical control improvements, comprehensive retraining, enhanced supervisory oversight, and a data integrity risk assessment across all other systems. Regulators assess the credibility and thoroughness of remediation responses — superficial CAPA plans that address individual incidents without systemic analysis consistently fail to satisfy inspection follow-up.
Backup and recovery are data integrity obligations, not merely IT best practices. GxP data must be backed up on a defined schedule to a separate, secure location, not the same server that hosts the primary records. Recovery must be tested periodically to confirm backups are complete and restorable. Failure to test recovery has been cited in FDA observations: an untested backup is an unverified backup. Backup copies must also be read-only, because a backup that can be altered independently of the primary record cannot be trusted as a true copy of it.
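A recovery test can be partly automated by restoring a backup to a scratch location and comparing checksums against the primary records. The following is a minimal sketch, assuming both copies are reachable as ordinary directories. The function names `sha256_of` and `verify_restore` are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_restore(primary_dir, restored_dir):
    """List files that are missing from, or differ in, the restored copy."""
    primary_dir, restored_dir = Path(primary_dir), Path(restored_dir)
    problems = []
    for source in sorted(primary_dir.rglob("*")):
        if not source.is_file():
            continue
        relative = source.relative_to(primary_dir)
        restored = restored_dir / relative
        if not restored.is_file():
            problems.append(f"missing: {relative.as_posix()}")
        elif sha256_of(source) != sha256_of(restored):
            problems.append(f"mismatch: {relative.as_posix()}")
    return problems
```

An empty result shows only that the restored files match the primary records byte for byte. Confirming that the restored data can actually be opened and read in its native application is a separate part of the recovery test.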
Data integrity obligations extend to contract laboratories, CMOs, and other third parties generating GxP data on a company's behalf. The regulated company remains responsible for the integrity of all data used to support product releases and regulatory submissions, regardless of origin. Supplier audits must include data integrity assessment: audit trail configuration, user access controls, and laboratory data practices. A data integrity failure at a contract laboratory is therefore the contracting company's compliance failure, a principle explicitly stated in both FDA and MHRA guidance.
Data integrity culture is the organizational environment where personnel understand their integrity obligations, feel empowered to report concerns without fear of retaliation, and are not pressured to falsify or conceal failures. Regulators have increasingly moved from citing technical control failures alone to identifying inadequate management oversight and quality culture as root causes. An environment where OOS results create excessive pressure, where sample rerunning is informal practice, or where audit trail review is superficial signals a culture-level data integrity risk that technical controls alone cannot remedy.
FDA guidance distinguishes between static and dynamic electronic records. Static records are fixed representations — a PDF or printed report — that cannot be reprocessed. Dynamic records retain original data relationships, calculations, and metadata in their native format and can be reprocessed or queried. Chromatography raw data is inherently dynamic: the original detector signal must be preserved in its native CDS format, not only as a printed chromatogram. Storing only static outputs and discarding the native dynamic data is a data integrity failure that FDA consistently cites in CDS-related observations.
Spreadsheets used for GxP calculations present significant data integrity risks because standard applications lack built-in audit trails, access controls, and formula protection. Any user with file access can modify data, alter formulas, or delete entries without a traceable record. Where spreadsheets are used in GxP contexts, compensating controls are required: file access restrictions, version control, formula protection, and periodic validation of the spreadsheet itself. FDA 483 observations frequently cite uncontrolled spreadsheets as a data integrity gap — particularly in laboratory and manufacturing calculations.
Data integrity training must build genuine understanding of why integrity matters and what specific behaviors constitute violations. Personnel must understand ALCOA+ principles, what constitutes unauthorized data modification, how audit trails function in their systems, and the consequences of integrity failures for patients and the organization. Training must be role-specific and system-specific; generic annual awareness training does not satisfy regulatory expectations. Training records must be maintained so that, during an inspection, each individual can be linked to training on the specific systems and procedures they operate.
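The link between people, systems, and training can be checked mechanically by comparing system access against training records. This is a sketch under stated assumptions: the data shapes (a mapping of user to systems, and a set of user and system pairs) and the function name `untrained_access` are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def untrained_access(
    system_access: Dict[str, Set[str]],
    training_records: Set[Tuple[str, str]],
) -> List[Tuple[str, str]]:
    """Return (user, system) pairs where access exists without recorded training."""
    gaps = []
    for user in sorted(system_access):
        for system in sorted(system_access[user]):
            if (user, system) not in training_records:
                gaps.append((user, system))
    return gaps
```

For example, a user with access to both a LIMS and a CDS but a training record for the LIMS only would be reported once, for the CDS.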
When a data integrity failure is identified — through internal audit, audit trail review, or regulatory inspection — a formal investigation is required before corrective action is finalized. The investigation must determine the full extent of affected records, root cause, whether the failure was deliberate or systemic, and the potential impact on product quality decisions. Scope limitation is a common deficiency: companies frequently investigate a specific incident without determining whether the same failure pattern exists across other systems, sites, or time periods.
FDA and MHRA approach data integrity from the same principles but with different scope. FDA's 2018 guidance addresses data integrity within the cGMP framework through a question-and-answer format focused on 21 CFR Parts 210, 211, and 212. MHRA's 2018 GxP Data Integrity Guidance is broader: it covers all GxP environments including GCP and GLP, includes explicit data governance and management accountability requirements, and extends expectations to suppliers. Organizations operating across both jurisdictions must address MHRA's additional governance and supplier oversight requirements that go beyond FDA's cGMP-specific scope.
Inspection readiness for data integrity means demonstrating active, ongoing controls — not a compliance posture assembled before an inspection. Key readiness indicators include: documented data governance policies, SOPs governing audit trail review with evidence of routine execution, validated systems with confirmed audit trail configuration, a computerized system inventory with integrity risk classifications, completed data integrity risk assessments, and training records linking personnel to specific system SOPs. Inspectors look for evidence of a functioning, embedded program — not documentation assembled retrospectively in response to an announced inspection.