— CH. 1 · DEFINING THE RECORD TYPE —

Record (computer science)

  • A record groups fields of possibly different types into a single unit. A date stored as a record, for example, contains three distinct fields: a numeric year, a string month, and a numeric day. A circle record might hold a numeric radius alongside a center point that is itself a record with x and y coordinates. Such composite structures differ fundamentally from arrays, where every element shares the same type. In computer science, this collection of fields is often called a structure or user-defined type. Most modern programming languages let developers define new record types, giving each field an identifier that serves as a label for accessing that component within the larger container. Type theorists generally prefer product types without named fields because of their simplicity. Yet proper record types are still studied in systems such as System F-sub, and records that include function-typed fields can express features found in object-oriented programming.
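
The date and circle records described above can be sketched in Python with `dataclasses`. This is a minimal illustration, not a canonical definition: the type names `Date`, `Point`, and `Circle` and the sample values are assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Date:
    year: int    # numeric field
    month: str   # string field
    day: int     # numeric field


@dataclass
class Point:
    x: float
    y: float


@dataclass
class Circle:
    radius: float
    center: Point   # a field may itself be a record


# Hypothetical sample values, chosen only for illustration.
d = Date(year=2024, month="March", day=15)
c = Circle(radius=2.0, center=Point(x=0.0, y=1.5))

# Fields are reached through their identifiers, not numeric positions.
print(d.month)       # March
print(c.center.y)    # 1.5
```

Unlike a list or array, each field here has its own declared type and its own name, which is the distinction the paragraph draws.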

  • A journal sheet from the 1880 United States census shows rows of tabular data, each row a record corresponding to a single person. The concept traces back to accounting ledgers used since remote times. The modern notion of a record was already implicit in 19th-century mechanical calculators such as Babbage's Analytical Engine. The original machine-readable medium for data was the punch card used in the 1890 United States census, where each card represented a single record; a Hollerith punched card from 1895 likewise held one complete unit of information. Because punch cards had a pre-determined length, these were fixed-length records. Records were well established by the first half of the 20th century, when most data processing relied on punched cards, and assigning specific columns to specific fields kept thousands of entries consistent. When storage systems advanced to magnetic tape and hard drives, variable-length records became standard, with a record's size approximately equal to the sum of its field sizes in bytes rather than a fixed physical limit. Some early computers, such as the IBM 1620, provided hardware support for delimiting records and fields, along with special instructions for copying such records.

  • COBOL was the first widespread programming language to support sophisticated record types. Its facilities allowed definitions of nested records with alphanumeric, integer, and fractional fields of arbitrary size. Fields could automatically format values by inserting currency signs or decimal points. A MOVE CORRESPONDING statement assigned matching fields between two records based on their names. Early languages for numeric computing, such as FORTRAN (up to FORTRAN IV) and ALGOL 60, had no record types; their successors added them, with ALGOL 68 providing structures and Fortran gaining standard derived types in Fortran 90 (some FORTRAN 77 compilers offered records as vendor extensions). The original Lisp offered only the built-in cons cell rather than true records. Pascal integrated record types into a logically consistent type system alongside its other basic types. PL/I provided COBOL-style records. C provides the record concept through structs. Most languages designed after Pascal, including Ada, Modula, and Java, support records in some form. Java made dedicated record classes a standard feature in version 16, after previews in versions 14 and 15, to simplify data aggregate classes; their fields are private and final, and constructors are generated automatically. C# introduced dedicated record syntax in version 9. Many programmers regard traditional records as obsolete because object-oriented features surpass them, while others argue that their low overhead keeps them relevant in assembly language and other low-level programming.

  • Fields are often stored in consecutive memory locations, in the order in which they are declared in the record type. As a result, two or more fields may share the same word of memory, a feature systems programmers use to access specific bits of a word directly. Compilers frequently add padding fields, mostly invisible to the programmer, to comply with machine alignment constraints, for example that a floating-point field must occupy a single word. Some languages instead implement a record as an array of addresses pointing to the individual fields. Objects in object-oriented languages often require more complicated implementations, especially when multiple class inheritance is allowed. A self-defining record contains information that identifies the record type and locates data within the record. It may include the offsets of its elements, so that elements can be stored in any order or omitted entirely. This information acts as metadata for the record, much like UNIX file metadata such as creation time and size in bytes. Alternatively, the elements can simply follow one another in any order, provided each carries an element identifier. Either way, the stored data can be interpreted dynamically without relying on fixed positions.
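
The layout effects described above (declaration order, alignment padding, and several fields sharing one storage unit) can be observed from Python with the standard `ctypes` module, which lays out structures the way the platform's C compiler would. This is a sketch under assumptions: the structure names `Padded` and `Flags` and their fields are hypothetical, and exact sizes depend on the platform, so the code prints them rather than promising specific numbers.

```python
import ctypes


class Padded(ctypes.Structure):
    # A 1-byte field followed by a double, in declaration order.
    _fields_ = [
        ("flag", ctypes.c_uint8),
        ("value", ctypes.c_double),
    ]


class Flags(ctypes.Structure):
    # Three bit fields (1 + 3 + 4 bits) declared on the same 8-bit type.
    _fields_ = [
        ("ready", ctypes.c_uint8, 1),
        ("mode", ctypes.c_uint8, 3),
        ("level", ctypes.c_uint8, 4),
    ]


# Sum of the raw field sizes, ignoring any padding.
raw_size = ctypes.sizeof(ctypes.c_uint8) + ctypes.sizeof(ctypes.c_double)

print("sum of field sizes:", raw_size)
print("actual record size:", ctypes.sizeof(Padded))
print("offset of 'value': ", Padded.value.offset)

f = Flags(ready=1, mode=5, level=9)
print("Flags size in bytes:", ctypes.sizeof(Flags))
print("mode field:", f.mode)
```

On machines where a double must be aligned to more than one byte, the record is larger than the sum of its fields and `value` does not start at offset 1; the difference is the compiler-inserted padding. The `Flags` record shows the opposite effect: three named fields packed into a single byte.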

  • Declaring a record type specifies the position, type, and (possibly) name of each field. Constructing a record value means initializing those fields. Read and write operations access or modify individual fields of a record. Comparing two records for equality checks that all corresponding fields are equal. Computing a standard hash value lets records be stored and looked up efficiently in collections. Some languages provide facilities to enumerate the fields of a record, which is useful for services such as debuggers and garbage collectors. Record subtyping lets a record with more fields be used where fewer are expected: a record with x, y, and z fields belongs to the type that expects only x and y, so passing it to a function that requires only x and y works because the required fields exist. In type-theoretic settings, operations such as adding or removing fields are also studied, although they are troublesome in many practical implementations. Most languages permit assignment between records that have exactly the same type. Two record types defined separately may remain distinct even if their fields match perfectly. Some languages match fields by position within the record, while others, such as COBOL with MOVE CORRESPONDING, match them by name. Where order comparison is supported, records are compared lexicographically, field by field. PL/I allows both kinds of assignment and also structure expressions such as a = a + 1, where a is a record.
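
Several of the operations listed above can be sketched with Python's `dataclasses` and `typing.Protocol`. The names `Point3D`, `HasXY`, and `norm1` are hypothetical and exist only for this example. Note one assumption in particular: the subtyping relationship is verified only by a static type checker, since at runtime Python simply looks up the attributes it needs.

```python
from dataclasses import dataclass, fields, replace
from typing import Protocol


@dataclass(frozen=True, order=True)
class Point3D:
    # frozen=True gives a generated __hash__; order=True gives <, <=, >, >=.
    x: int
    y: int
    z: int


class HasXY(Protocol):
    # Structural type: any record exposing readable x and y qualifies.
    @property
    def x(self) -> int: ...

    @property
    def y(self) -> int: ...


def norm1(p: HasXY) -> int:
    # Needs only x and y, so a record with an extra z field is acceptable.
    return abs(p.x) + abs(p.y)


a = Point3D(1, 2, 3)
b = Point3D(1, 2, 3)
c = Point3D(1, 5, 0)

same = a == b                          # field-by-field equality
same_hash = hash(a) == hash(b)         # equal records hash alike
ordered = a < c                        # lexicographic: x ties, then 2 < 5
names = [f.name for f in fields(a)]    # enumerate the declared fields
moved = replace(a, z=10)               # "writing" a frozen record makes a copy

print(same, same_hash, ordered)
print(names)
print(moved)
print(norm1(a))
```

Making the record immutable is a design choice for this sketch: it is what allows a safe generated hash, at the cost of turning field writes into copies via `replace`.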

  • A primary key is unique across all stored records, without exception. In an employee file containing numbers and names, for instance, no two records share the same employee number. The department code might be indexed as a secondary key, since it is not necessarily unique. If it is not indexed, listing the staff of a specific department requires scanning the entire employee file. Keys are usually chosen so that few or no records share the same key value. Salary fields rarely serve as keys because many employees have identical salaries. Query languages like SQL operate on tables whose rows are essentially records. A table may include a foreign key, a field that references data in another table. This relationship ties row-based storage together into coherent relational structures. Indexes store their information separately, often in separate files, to make lookups significantly faster. The concept thus extends from punched card columns to modern database management systems. Row-based storage keeps data organized as sequences of records for efficient retrieval.
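
The employee-file example above can be sketched with Python's built-in `sqlite3` module. The table names, column names, and sample rows are assumptions made for illustration; only the roles of the primary key, the secondary index, and the foreign key come from the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE department (
        code TEXT PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE employee (
        number    INTEGER PRIMARY KEY,                        -- unique per record
        name      TEXT NOT NULL,
        dept_code TEXT NOT NULL REFERENCES department(code)   -- foreign key
    );
    -- Secondary key: not unique, but indexed to avoid a full scan.
    CREATE INDEX employee_by_dept ON employee(dept_code);
""")

conn.executemany("INSERT INTO department VALUES (?, ?)",
                 [("ENG", "Engineering"), ("OPS", "Operations")])
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)",
                 [(1, "Ada", "ENG"), (2, "Grace", "ENG"), (3, "Edsger", "OPS")])


def insert_rejected(row):
    """Return True if the database refuses the employee row."""
    try:
        conn.execute("INSERT INTO employee VALUES (?, ?, ?)", row)
    except sqlite3.IntegrityError:
        return True
    return False


duplicate_rejected = insert_rejected((1, "Alan", "OPS"))   # reuses primary key 1
orphan_rejected = insert_rejected((4, "Alan", "XXX"))      # no such department

eng_staff = [name for (name,) in conn.execute(
    "SELECT name FROM employee WHERE dept_code = ? ORDER BY number", ("ENG",))]

print(duplicate_rejected, orphan_rejected, eng_staff)
```

The department lookup returns the same rows with or without `employee_by_dept`; the index only changes how the database finds them.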

Common questions

What is a record in computer science?

A record is a composite data type that groups fields of possibly different data types into a single unit. A date record, for example, contains a numeric year, a string month, and a numeric day, keeping related information together.

When did the concept of records originate in computing history?

The concept traces back to accounting ledgers used since remote times, and the modern notion was already implicit in 19th-century mechanical calculators such as Babbage's Analytical Engine. The original machine-readable medium was the punch card used during the 1890 United States census, where each card represented a single record of pre-determined length.

Which programming language first supported sophisticated record types widely?

COBOL was the first widespread programming language to support sophisticated record types, with facilities for nested records containing alphanumeric, integer, and fractional fields. Later versions of Fortran and ALGOL added record types, and most languages designed after Pascal, including Ada, Modula, and Java, also support them.

How do records differ from arrays in memory storage?

A record's fields are often stored in consecutive memory locations in declaration order, but unlike the elements of an array they need not share the same type or size. Systems programmers use this layout to access specific bits of a single word directly, while compilers frequently add invisible padding fields to comply with machine alignment constraints.

What is the role of primary keys in database management systems?

A primary key is unique across all stored records, so in an employee file containing numbers and names no two records share the same primary key. Indexes, often stored in separate files, make lookups significantly faster; without an index on a secondary key such as the department code, listing the staff of a specific department requires scanning the entire file.