List of important publications in computer science


This is a list of important publications in computer science, organized by field.
Some reasons why a particular publication might be regarded as important:

Topic creator – A publication that created a new topic
Breakthrough – A publication that changed scientific knowledge significantly
Influence – A publication which has significantly influenced the world or has had a massive impact on the teaching of computer science

Artificial intelligence

''Computing Machinery and Intelligence''

Description: This paper discusses the various arguments for why a machine cannot be intelligent and asserts that none of them is convincing. It also proposed the Turing test, which it calls "The Imitation Game": according to Turing, it is pointless to ask whether a machine can think; it is sufficient to ask whether it can act intelligently.

''A Proposal for the Dartmouth Summer Research Project on Artificial Intelligence''

Description: This summer research proposal inaugurated and defined the field. It contains the first use of the term artificial intelligence and this succinct description of the philosophical foundation of the field: "every aspect of learning or any other feature of intelligence can in principle be so precisely described that a machine can be made to simulate it." The proposal invited researchers to the Dartmouth conference, which is widely considered the "birth of AI".

''Fuzzy sets''

Description: This seminal 1965 paper laid out the mathematics of fuzzy set theory.

''Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference''

Description: This book introduced Bayesian networks and related probabilistic methods to AI.

''Artificial Intelligence: A Modern Approach''

Description: The standard textbook in artificial intelligence. The book's website lists over 1100 colleges and universities that have used it.

Machine learning

''An Inductive Inference Machine''

Description: The first paper written on machine learning. Emphasized the importance of training sequences, and the use of parts of previous solutions to problems in constructing trial solutions to new problems.

''Language identification in the limit''

Description: This paper founded algorithmic learning theory.

''On the uniform convergence of relative frequencies of events to their probabilities''

Description: Introduced VC theory: the uniform convergence of empirical frequencies to their probabilities and the VC dimension, cornerstones of computational learning theory.

''A theory of the learnable''

Description: Introduced the probably approximately correct (PAC) learning framework.
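As a sketch of the framework's central definition, in modern notation rather than Valiant's original formulation: an algorithm PAC-learns a concept class C if, for every target concept in C, every example distribution D, and every accuracy parameter \varepsilon and confidence parameter \delta, it outputs a hypothesis h with

    \Pr\left[\operatorname{err}_D(h) \le \varepsilon\right] \ge 1 - \delta

using a number of examples and an amount of computation polynomial in 1/\varepsilon and 1/\delta.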

''Learning representations by back-propagating errors''

Description: Seppo Linnainmaa's reverse mode of automatic differentiation is used in experiments by David Rumelhart, Geoffrey Hinton and Ronald J. Williams to learn internal representations.
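A minimal sketch of the technique on a toy problem; the network size, learning rate, step count, and XOR data below are illustrative choices of mine, not the paper's experiments.

  import numpy as np

  rng = np.random.default_rng(0)

  # A one-hidden-layer network trained by backpropagation on XOR.
  X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
  y = np.array([[0], [1], [1], [0]], dtype=float)

  W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)   # input -> hidden
  W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)   # hidden -> output

  def sigmoid(z):
      return 1.0 / (1.0 + np.exp(-z))

  lr = 0.5
  for _ in range(20000):
      # Forward pass.
      h = sigmoid(X @ W1 + b1)
      out = sigmoid(h @ W2 + b2)
      # Backward pass: the chain rule applied layer by layer (reverse mode).
      d_out = (out - y) * out * (1 - out)     # error signal at the output
      d_h = (d_out @ W2.T) * h * (1 - h)      # error propagated to the hidden layer
      W2 -= lr * h.T @ d_out; b2 -= lr * d_out.sum(axis=0)
      W1 -= lr * X.T @ d_h;   b1 -= lr * d_h.sum(axis=0)

  print(np.round(out.ravel(), 2))   # typically approaches [0, 1, 1, 0]

The hidden activations h are the learned internal representations the paper's title refers to.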

''Induction of Decision Trees''

Description: Decision trees are a common learning algorithm and a decision representation tool. Decision trees were developed by many researchers in many areas even before this paper, but this paper, which introduced the ID3 algorithm, is one of the most influential in the field.

''Learning Quickly When Irrelevant Attributes Abound: A New Linear-threshold Algorithm''

Description: One of the papers that started the field of on-line learning. In this learning setting, a learner receives a sequence of examples, making predictions after each one, and receiving feedback after each prediction. Research in this area is remarkable because the algorithms and proofs tend to be very simple and beautiful, and the model makes no statistical assumptions about the data. In other words, the data need not be random, but can be chosen arbitrarily by "nature" or even an adversary. Specifically, this paper introduced the winnow algorithm.
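A minimal sketch of the multiplicative-update idea, in the Winnow2 variant; the update factor alpha = 2 and threshold theta = n follow common expositions and are assumptions here.

  def winnow(examples, n, alpha=2.0):
      """Winnow2 sketch for on-line learning over n boolean attributes.

      examples: iterable of (x, y) pairs, where x is a 0/1 vector of
      length n and y is the 0/1 label.
      """
      w = [1.0] * n            # one weight per attribute, all starting at 1
      theta = float(n)         # fixed prediction threshold
      for x, y in examples:
          y_hat = 1 if sum(wi * xi for wi, xi in zip(w, x)) >= theta else 0
          if y_hat == 0 and y == 1:     # promotion: boost active attributes
              w = [wi * alpha if xi else wi for wi, xi in zip(w, x)]
          elif y_hat == 1 and y == 0:   # demotion: shrink active attributes
              w = [wi / alpha if xi else wi for wi, xi in zip(w, x)]
      return w

The multiplicative updates are what let the mistake bound grow only logarithmically with the number of irrelevant attributes, the property the title advertises.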

''Learning to predict by the method of Temporal difference''

Description: Introduced the temporal-difference method for reinforcement learning.
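In its simplest form, TD(0), written in modern notation (the paper treats the more general TD(lambda) family), the value estimate of the current state is nudged toward the one-step bootstrapped target:

    V(s_t) \leftarrow V(s_t) + \alpha \left[ r_{t+1} + \gamma V(s_{t+1}) - V(s_t) \right]

The bracketed term is the temporal-difference error: the disagreement between successive predictions, rather than a final outcome, drives learning.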

''Learnability and the Vapnik–Chervonenkis dimension''

Description: The complete characterization of PAC learnability using the VC dimension.

''Cryptographic limitations on learning boolean formulae and finite automata''

Description: Proving negative results for PAC learning.

''The strength of weak learnability''

Description: Proved that weak and strong learnability are equivalent in the noise-free PAC framework. The proof introduced the boosting method.
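Schapire's original construction is a recursive majority of three hypotheses; the boosting idea is most often illustrated today with the later AdaBoost algorithm of Freund and Schapire, sketched below (the function names and interfaces are mine).

  import math

  def boost(examples, weak_learner, rounds):
      """AdaBoost-style boosting sketch.

      examples: list of (x, y) pairs with labels y in {-1, +1}.
      weak_learner(examples, dist) -> hypothesis h, callable x -> {-1, +1}.
      """
      m = len(examples)
      dist = [1.0 / m] * m                  # start with uniform example weights
      ensemble = []                         # list of (alpha, h) pairs
      for _ in range(rounds):
          h = weak_learner(examples, dist)
          err = sum(d for d, (x, y) in zip(dist, examples) if h(x) != y)
          err = min(max(err, 1e-12), 1.0 - 1e-12)
          alpha = 0.5 * math.log((1 - err) / err)    # vote of this weak hypothesis
          # Reweight: misclassified examples gain weight, correct ones lose it.
          dist = [d * math.exp(-alpha * y * h(x))
                  for d, (x, y) in zip(dist, examples)]
          z = sum(dist)
          dist = [d / z for d in dist]
          ensemble.append((alpha, h))

      def strong(x):                        # weighted majority vote
          return 1 if sum(a * h(x) for a, h in ensemble) >= 0 else -1
      return strong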

''A training algorithm for optimum margin classifiers''

Description: This paper presented support vector machines, a practical and popular machine learning algorithm. Support vector machines often use the kernel trick.
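In modern notation, the kernel trick replaces inner products in a high-dimensional feature space \phi by a kernel function evaluated in the input space, so the learned classifier takes the form

    f(x) = \operatorname{sgn}\left( \sum_i \alpha_i y_i K(x_i, x) + b \right), \qquad K(x, x') = \langle \phi(x), \phi(x') \rangle

For example, the polynomial kernel K(x, x') = (x \cdot x' + 1)^d computes the feature-space inner product without ever constructing \phi(x) explicitly.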

''A fast learning algorithm for deep belief nets''

Description: This paper presented a tractable greedy layer-wise learning algorithm for deep belief networks, which led to great advances in the field of deep learning.

''Knowledge-based analysis of microarray gene expression data by using support vector machines''

Description: The first application of supervised learning to gene expression data, in particular Support Vector Machines. The method is now standard, and the paper one of the most cited in the area.

Compilers

''On the translation of languages from left to right''

Description: LR parser, which does bottom up parsing for deterministic context-free languages. Later derived parsers, such as the LALR parser, have been and continue to be standard practice, such as in Yacc and descendants.

''Semantics of Context-Free Languages.''

Description: Introduced attribute grammars, the basis for yacc's S-attributed and zyacc's LR-attributed approaches.

''A program data flow analysis procedure''

Description: From the abstract: "The global data relationships in a program can be exposed and codified by the static analysis methods described in this paper. A procedure is given which determines all the definitions which can possibly reach each node of the control flow graph of the program and all the definitions that are live on each edge of the graph."

''A Unified Approach to Global Program Optimization''

Description: Formalized the concept of data-flow analysis as fixpoint computation over lattices, and showed that most static analyses used for program optimization can be uniformly expressed within this framework.
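As a sketch of the fixpoint idea applied to the reaching-definitions problem of the previous entry; the names and the naive round-robin iteration order are illustrative choices of mine.

  def reaching_definitions(nodes, succ, gen, kill):
      """Iterate the data-flow equations to a fixpoint over the subset lattice.

      nodes: list of node ids; succ[n]: iterable of successors of n;
      gen[n] / kill[n]: sets of definitions generated / killed at n.
      Returns the IN and OUT sets of every node.
      """
      in_sets = {n: set() for n in nodes}
      out_sets = {n: set() for n in nodes}
      changed = True
      while changed:                        # repeat until nothing moves
          changed = False
          for n in nodes:
              # Meet over predecessors: union of their OUT sets.
              new_in = set()
              for p in nodes:
                  if n in succ[p]:
                      new_in |= out_sets[p]
              new_out = gen[n] | (new_in - kill[n])    # transfer function
              if new_in != in_sets[n] or new_out != out_sets[n]:
                  in_sets[n], out_sets[n] = new_in, new_out
                  changed = True
      return in_sets, out_sets

Because the transfer functions are monotone and the lattice of definition sets is finite, the iteration is guaranteed to terminate at a fixpoint, which is Kildall's central observation.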

''YACC: Yet another compiler-compiler''

Description: Yacc is a tool that made compiler writing much easier.

''gprof: A Call Graph Execution Profiler''

Description: Introduced the gprof call-graph profiler.

''Compilers: Principles, Techniques and Tools''

Description: This book became a classic in compiler writing. It is also known as the Dragon Book, after the dragon that appears on its cover.

Computer architecture

''Colossus computer''

Description: The Colossus machines were early computing devices used by British codebreakers to break German messages encrypted with the Lorenz Cipher during World War II. Colossus was an early binary electronic digital computer. The design of Colossus was later described in the referenced paper.

''First Draft of a Report on the EDVAC''

Description: It contains the first published description of the logical design of a computer using the stored-program concept, which has come to be known as the von Neumann architecture.

''Architecture of the IBM System/360''

Description: The IBM System/360 is a mainframe computer system family announced by IBM on April 7, 1964. It was the first family of computers making a clear distinction between architecture and implementation.

''The case for the reduced instruction set computer''

Description: Introduced the reduced instruction set computer (RISC), a CPU design philosophy that favors a smaller set of simpler instructions.

''Comments on "the Case for the Reduced Instruction Set Computer"''

Description: A critical response arguing that the case for RISC had not been convincingly made.

''The CRAY-1 Computer System''

Description: The Cray-1 was a supercomputer designed by a team including Seymour Cray for Cray Research. The first Cray-1 system was installed at Los Alamos National Laboratory in 1976, and it went on to become one of the best known and most successful supercomputers in history.

''Validity of the Single Processor Approach to Achieving Large Scale Computing Capabilities''

Description: Introduced what became known as Amdahl's law, a bound on the speedup obtainable by parallelizing only part of a computation.
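In its modern textbook formulation (not Amdahl's original wording): if a fraction p of a computation can be parallelized across N processors, the overall speedup is

    S(N) = \frac{1}{(1 - p) + p/N}

so even with p = 0.95 the speedup can never exceed 1/(1 - p) = 20, no matter how many processors are added.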

''A Case for Redundant Arrays of Inexpensive Disks (RAID)''

Description: This paper discusses the concept of RAID disks, outlines the different levels of RAID, and the benefits of each level. It is a good paper for discussing issues of reliability and fault tolerance of computer systems, and the cost of providing such fault-tolerance.
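As a toy illustration of the parity idea behind RAID levels 3 through 5 (the data below is made up): the parity block is the bytewise XOR of the data blocks, so any single lost block can be rebuilt by XOR-ing the survivors with the parity.

  def parity(blocks):
      """Bytewise XOR over equal-length blocks."""
      out = bytearray(len(blocks[0]))
      for block in blocks:
          for i, byte in enumerate(block):
              out[i] ^= byte
      return bytes(out)

  data = [b"disk0AAA", b"disk1BBB", b"disk2CCC"]   # toy 8-byte "disks"
  p = parity(data)
  # If disk 1 fails, its contents are the XOR of the survivors and the parity:
  assert parity([data[0], data[2], p]) == data[1]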

''The case for a single-chip multiprocessor''

Description: This paper argues that the approach taken to improving the performance of processors by adding multiple instruction issue and out-of-order execution cannot continue to provide speedups indefinitely. It lays out the case for making single chip processors that contain multiple "cores". With the mainstream introduction of multicore processors by Intel in 2005, and their subsequent domination of the market, this paper was shown to be prescient.

Computer graphics

''The Rendering Equation''

Description: The Academy of Motion Picture Arts and Sciences cited this paper as a "milestone in computer graphics".
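In modern notation (Kajiya's paper writes it slightly differently), the rendering equation expresses the outgoing radiance at a surface point x as emitted plus reflected incoming radiance:

    L_o(x, \omega_o) = L_e(x, \omega_o) + \int_{\Omega} f_r(x, \omega_i, \omega_o)\, L_i(x, \omega_i)\, (\omega_i \cdot n)\, d\omega_i

where f_r is the BRDF and n the surface normal. Physically based rendering algorithms such as path tracing are approximation schemes for this integral equation.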

''Sketchpad, a Man-Machine Graphical Communication System''

Description: One of the founding works on computer graphics.

Computer vision

''The Phase Correlation Image Alignment Method''

Description: A correlation method based upon the inverse Fourier transform.
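As a sketch of the method (sign and normalization conventions vary between presentations): if g is a translated copy of f, g(x) = f(x - d), their Fourier transforms differ only by a phase factor, and the normalized cross-power spectrum isolates it:

    R(u) = \frac{G(u)\, F^{*}(u)}{\left| G(u)\, F^{*}(u) \right|} = e^{-i\, u \cdot d}

The inverse Fourier transform of R is an impulse at the displacement d, so the alignment is read off as the location of the correlation peak.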

''Determining Optical Flow''

Description: A method for estimating the image motion of world points between two frames of a video sequence.
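The paper's starting point, in modern notation, is the brightness constancy constraint: a single linear equation in the two unknown flow components (u, v), which by itself is underdetermined (the aperture problem). Horn and Schunck resolve this by minimizing a functional that adds a smoothness penalty:

    I_x u + I_y v + I_t = 0, \qquad \min_{u, v} \iint \left( I_x u + I_y v + I_t \right)^2 + \alpha^2 \left( \lVert \nabla u \rVert^2 + \lVert \nabla v \rVert^2 \right) dx\, dy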

''An Iterative Image Registration Technique with an Application to Stereo Vision''

Description: This paper provides an efficient technique for image registration.

''The Laplacian Pyramid as a compact image code''

Description: A technique for image encoding using local operators of many scales.

''Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images''

Description: Introduced (1) Markov random fields (MRFs) for image analysis, and (2) Gibbs sampling, which revolutionized computational Bayesian statistics and thus had a paramount impact on many other fields in addition to computer vision.

''Snakes: Active contour models''

Description: An interactive variational technique for image segmentation and visual tracking.

''Condensation – conditional density propagation for visual tracking''

Description: A particle-filtering technique for visual tracking.

''Object recognition from local scale-invariant features''

Description: Introduced SIFT, a technique for robust local feature detection and description.

Concurrent, parallel, and distributed computing

Topics covered: concurrent computing, parallel computing, and distributed computing.

Databases

''A relational model for large shared data banks''

Description: This paper introduced the relational model for databases, which became the dominant model for database systems.

''Binary B-Trees for Virtual Memory''

Description: This paper introduced the B-tree data structure, which became the standard index structure in database systems.
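A minimal sketch of the search half of the structure (insertion and node splitting are omitted; the class and field names are mine):

  import bisect

  class BTreeNode:
      """B-tree node: keys are sorted; children[i] leads to keys below
      keys[i], and the last child to keys above them all."""
      def __init__(self, keys, children=None):
          self.keys = keys
          self.children = children or []   # an empty list marks a leaf

      def search(self, key):
          i = bisect.bisect_left(self.keys, key)
          if i < len(self.keys) and self.keys[i] == key:
              return True                  # found in this node
          if not self.children:
              return False                 # leaf reached: key is absent
          return self.children[i].search(key)

Because every node holds many keys, the tree stays shallow, which is what makes the structure a good match for block-oriented storage such as disks.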

''Relational Completeness of Data Base Sublanguages''

Description: Defined relational completeness as a benchmark for the expressive power of database query languages, and showed the equivalence of relational algebra and relational calculus.

''The Entity Relationship Model – Towards a Unified View of Data''

Description: This paper introduced the entity-relationship diagram method of database design.

''SEQUEL: A structured English query language''

Description: This paper introduced SEQUEL, the language that evolved into SQL.

''The notions of consistency and predicate locks in a database system''

Description: This paper defined the concepts of transaction, consistency and schedule. It also argued that a transaction needs to lock a logical rather than a physical subset of the database.

''Federated database systems for managing distributed, heterogeneous, and autonomous databases''

Description: Introduced the concept of federated database systems, which had a huge impact on data interoperability and the integration of heterogeneous data sources.

''Mining association rules between sets of items in large databases''

Description: Introduced association rules, a very common method for data mining.
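In the paper's framework, stated in modern notation, a candidate rule X \Rightarrow Y over a set of transactions T is judged by two measures:

    \operatorname{supp}(X \Rightarrow Y) = \frac{\left| \{ t \in T : X \cup Y \subseteq t \} \right|}{|T|}, \qquad \operatorname{conf}(X \Rightarrow Y) = \frac{\operatorname{supp}(X \cup Y)}{\operatorname{supp}(X)}

Mining then amounts to finding every rule whose support and confidence exceed user-chosen thresholds.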

History of computation

''The Computer from Pascal to von Neumann''

Description: Perhaps the first book on the history of computation.

''A History of Computing in the Twentieth Century''

Description: Several chapters by pioneers of computing.

Information retrieval

''A Vector Space Model for Automatic Indexing''

Description: Presented the vector space model.

''Extended Boolean Information Retrieval''

Description: Presented the extended Boolean model, which combines Boolean queries with vector-space-style ranking.

''A Statistical Interpretation of Term Specificity and Its Application in Retrieval''

Description: Conceived a statistical interpretation of term specificity called Inverse document frequency, which became a cornerstone of term weighting.
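In modern notation (this exact formula postdates the paper), a term t that occurs in n_t of the N documents in a collection receives the weight

    \operatorname{idf}_t = \log \frac{N}{n_t}, \qquad w_{t,d} = \operatorname{tf}_{t,d} \cdot \operatorname{idf}_t

so rare, specific terms weigh more than common ones, which is precisely the statistical interpretation of specificity the paper proposed.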

Networking

''Data Communications and Networking''

Description: A comprehensive and accessible textbook on data communications and networking, accompanied by more than 830 figures and 150 tables.

Operating systems

''An experimental timesharing system.''

Description: This paper discusses time-sharing as a method of sharing computer resources. This idea changed how people interact with computer systems.

''The Working Set Model for Program Behavior''

Description: The beginning of the theory of caching, via the working set model of program behavior.

''Virtual Memory, Processes, and Sharing in MULTICS''

Description: The classic paper on Multics, the most ambitious operating system in the early history of computing. Difficult reading, but it describes the implications of trying to build a system that takes information sharing to its logical extreme. Most operating systems since Multics have incorporated a subset of its facilities.

''The nucleus of a multiprogramming system''

Description: Classic paper on the extensible nucleus architecture of the RC 4000 multiprogramming system, and what became known as the operating system kernel and microkernel architecture.

''Operating System Principles''

Description: The first comprehensive textbook on operating systems. Includes the first monitor notation.

''A note on the confinement problem''

Description: This paper addresses issues in constraining the flow of information from untrusted programs. It discusses covert channels, but more importantly it addresses the difficulty in obtaining full confinement without making the program itself effectively unusable. The ideas are important when trying to understand containment of malicious code, as well as aspects of trusted computing.

''The UNIX Time-Sharing System''

Description: The Unix operating system and its principles were described in this paper. Its main importance lies not in the paper itself but in the operating system, which has had a tremendous effect on operating systems and computer technology.

''Weighted voting for replicated data''

Description: This paper describes the consistency mechanism known as quorum consensus. It is a good example of algorithms that provide a continuous set of options between two alternatives. There have been many variations and improvements by researchers in the years that followed, and it is one of the consistency algorithms that should be understood by all. The options available by choosing different size quorums provide a useful structure for discussing the core requirements for consistency in distributed systems.
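The core constraint is simple to state: with V votes distributed among the replicas, a read quorum R and a write quorum W must satisfy

    R + W > V \qquad \text{and} \qquad W > V/2

so that every read quorum overlaps every write quorum and any two write quorums overlap, guaranteeing that a read always sees the most recent write. Sliding R and W between these bounds is the continuous trade-off between read cost and write cost mentioned above.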

''Experiences with Processes and Monitors in Mesa''

Description: This is the classic paper on synchronization techniques, including both alternate approaches and pitfalls.

''Scheduling Techniques for Concurrent Systems''

Description: Gave algorithms for the coscheduling of related processes.

''A Fast File System for UNIX''

Description: The file system of UNIX. One of the first papers discussing how to manage disk storage for high-performance file systems. Most file-system research since this paper has been influenced by it, and most high-performance file systems of the last 20 years incorporate techniques from this paper.

''The Design of the UNIX Operating System''

Description: This definitive description principally covered the System V Release 2 kernel, with some new features from Release 3 and BSD.

''The Design and Implementation of a Log-Structured File System''

Description: Log-structured file system.

''Microkernel operating system architecture and Mach''

Description: This is a good paper discussing one particular microkernel architecture and contrasting it with monolithic kernel design. Mach underlies Mac OS X, and its layered architecture had a significant impact on the design of the Windows NT kernel and modern microkernels like L4. In addition, its memory-mapped files feature was added to many monolithic kernels.

''An Implementation of a Log-Structured File System for UNIX''

Description: This paper described the first production-quality implementation of the log-structured idea, and it spawned much additional discussion of the viability and shortcomings of log-structured file systems. While ''The Design and Implementation of a Log-Structured File System'' was certainly the first, this one was important in bringing the research idea to a usable system.

''Soft Updates: A Solution to the Metadata Update problem in File Systems''

Description: A new way of maintaining filesystem consistency.

Programming languages

''The FORTRAN Automatic Coding System''

Description: This paper describes the design and implementation of the first FORTRAN compiler by the IBM team. Fortran is a general-purpose, procedural, imperative programming language that is especially suited to numeric computation and scientific computing.

''Recursive functions of symbolic expressions and their computation by machine, part I''

Description: This paper introduced LISP, the first functional programming language, which was used heavily in many areas of computer science, especially in AI. LISP also has powerful features for manipulating LISP programs within the language.

''ALGOL 60''

Description: Algol 60 introduced block structure.

''The next 700 programming languages''

Description: This seminal paper proposed an ideal language, ISWIM, which despite never being implemented influenced the whole later development of programming languages.

''Fundamental Concepts in Programming Languages''

Description: Introduced much programming language terminology still in use today, including R-values, L-values, parametric polymorphism, and ad hoc polymorphism.

''Lambda Papers''

Description: This series of papers and reports first defined the influential Scheme programming language and questioned the prevailing practices in programming language design, employing lambda calculus extensively to model programming language concepts and guide efficient implementation without sacrificing expressive power.

''Structure and Interpretation of Computer Programs''

Description: This textbook explains core computer programming concepts, and is widely considered a classic text in computer science.

''Comprehending Monads''

Description: This paper introduced monads to functional programming.

''Towards a Theory of Type Structure''

Description: This paper introduced the polymorphic lambda calculus (System F, discovered independently by Girard) and created the modern notion of parametric polymorphism.

''An axiomatic basis for computer programming''

Description: This paper introduced Hoare logic, which forms the foundation of program verification.
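The logic's judgments are Hoare triples {P} C {Q}: if precondition P holds and command C terminates, then postcondition Q holds afterwards. Two representative rules from the paper, in modern notation, are the assignment axiom and the rule of composition:

    \{ Q[E/x] \}\ x := E\ \{ Q \}, \qquad \frac{\{P\}\, C_1\, \{R\} \qquad \{R\}\, C_2\, \{Q\}}{\{P\}\, C_1 ; C_2\, \{Q\}}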

Scientific computing

Computational linguistics

Software engineering

''Software engineering: Report of a conference sponsored by the NATO Science Committee''

Description: A conference of leading people in the software field, circa 1968. The report defined the field of software engineering.

''A Description of the Model-View-Controller User Interface Paradigm in the Smalltalk-80 System'' (Model View Controller History, C2.com, http://c2.com/cgi/wiki?ModelViewControllerHistory, retrieved 2013-12-09)

Description: A description of the system that originated the GUI programming paradigm of Model–view–controller

''Go To Statement Considered Harmful''

Description: Argued against the use of the goto statement; the beginning of structured programming.

''On the criteria to be used in decomposing systems into modules''

Description: The importance of modularization and information hiding. Note that information hiding was first presented in a different paper of the same author – "Information Distributions Aspects of Design Methodology", Proceedings of IFIP Congress '71, 1971, Booklet TA-3, pp. 26–30

''Hierarchical Program Structures''

Description: The beginning of object-oriented programming. This paper argued that programs should be decomposed into independent components with small, simple interfaces, and that objects should have both data and related methods.

''A Behavioral Notion of Subtyping''

Description: Introduces the Liskov substitution principle and establishes behavioral subtyping rules.

''A technique for software module specification with examples''

Description: Software module specification.

''Structured Design''

Description: Seminal paper on Structured Design, data flow diagram, coupling, and cohesion.

''The Emperor's Old Clothes''

Description: Illustrates the "second-system effect" and the importance of simplicity.

''The Mythical Man-Month: Essays on Software Engineering''

Description: Throwing more people at the task will not speed its completion...

''No Silver Bullet: Essence and Accidents of Software Engineering''

Description: Argued that no single development, in technology or management, would by itself yield an order-of-magnitude improvement in software productivity within a decade, distinguishing the essential complexity of software from its accidental complexity.

''The Cathedral and the Bazaar''

Description: Open source methodology.

''Design Patterns: Elements of Reusable Object Oriented Software''

Description: This book was the first to define and list design patterns in computer science.

''Statecharts: A Visual Formalism For Complex Systems''

Description: Statecharts are a visual modeling method. They extend state machines and can be exponentially more compact, which enables formal modeling of applications that were previously too complex. Statecharts are part of the UML diagrams.

Security

Anonymity Systems

Theoretical computer science

Topics covered: theoretical computer science, including computability theory, computational complexity theory, algorithms, algorithmic information theory, information theory and formal verification.

Academic Search Engines