COMP 5408: Advanced Data Structures

assignments | office | about | topics | references | grading | cheating | special

This is the course webpage for the Winter 2014 offering of this course.
If you're struggling to find a Wikipedia article to create/contribute to, have a look at this list of missing computer science topics.
Assignment 3 is now available

About the Course

This is the home page for the graduate course Advanced Data Structures (formerly Topics in Data Structures) taught by Pat Morin in the School of Computer Science at Carleton University.

This course is about simple and easy to understand methods of data structure design and analysis that lead to efficient data structures for a variety of problems. The examples we use are selected because of their elegance and simplicity.

The course consists of three assignments, a final project, and a contribution to public knowledge. Each assignment and project has two parts, of which students need only do one. There will generally be an implementation/experimental part and a theoretical part so that students who prefer theory can do theory and students that prefer implementation can do implementations. The final project is a theory or implementation project. The contribution to public knowledge is a contribution to Wikipedia that adds information about one of the topics discussed in class or found while researching for the project.

Course Topics

Here are the course notes:

Assignments

Assignment 3 is now available as a PDF file. Those doing the implementation part will need the zip archive. It is due April 4th, or thereabouts.

Assignment 2 is now available as a PDF file. Those doing the implementation part will need the zip archive. It is due February 28 March 4.

Assignment 1 is now available as a PDF file. Those doing the implementation part will need the zip archive. It is due before/in class on Friday January 31th.

All assignments are due before class on the due date, and should be submitted by email. For time-management purposes, here are the assignment due dates:

Assignment 1Jan 31
Assignment 2Feb 21Feb 28Mar 4
Assignment 3Mar 21Apr 4
Contribution to KnowledgeApr 9
Final ProjectApril 16

Office Hours

I have office hours on Wednesday after class, 10:00-11:00. You can try my office at other times or consult my schedule and suggest a meeting time by email.

Grades

Your final grade is based on class participation, three assignments, a contribution to public knowledge, and a final project.

Class partiticipation10%
Assignment 115%
Assignment 215%
Assignment 315%
Contribution to public knowledge15%
Final Project30%
Total100%

In order to get class participation marks students are required to come to class and act as if they understand, or if they don't understand, to ask questions to improve their understanding.

Each assignment will have a theory and an implementation part. Students need only do one of these two parts to get full marks.

The final project is on a topic to be selected by the student and approved by me. Again, students may choose to do a theory project or an implementation/experimental project. In order to encourage students to do high quality research projects I am offering to pay travel, registration and accomodation expenses for any student who has their project accepted for publication in a refereed conference.

Theory Projects

Students who choose to do a theory project are encouraged to work on an open problem discussed in class or try and devise a simpler alternative for an existing data structure. The end result of this should be a paper containing a survey of related work and a summary of the new results obtained. If no new results were obtained then the paper should describe the approaches the student tried and the reasons they did not work.

If new non-trivial results are obtained the grade for the project will be 100%, otherwise the quality of the survey will determine most of the grade.

Implementation Projects

Students who choose to do an implementation project should implement several different data structures and compare their performance through rigorous experimental tests. The end result should be a paper giving a description of the data structures tested, the tests performed and the experimental results. It is acceptable to reuse implementations done for the assignments to complete the project.

Implementation papers should be of publishable quality, i.e., it should be possible to submit them to a conference and have a good chance of acceptance. To understand the quality and type of papers I am looking for students should look at recent issues of the Journal of Experimenal Algorithms, proceedings of the Workshop on Algorithm Engineering (WAE) and proceedings of the Workshop on Algorithms and Experiments (ALENEX).

Policy Regarding Unoriginal Work

Any student who is caught submitting work that they did not do themselves without crediting the original source will receive a mark of 0 (zero) on that assignment or project. Any group of two or more students handing in sufficiently similar assignments will all receive mark of zero.

And here is a message from Carleton University regarding academic integrity at the graduate level:

Plagiarism and cheating at the graduate level are viewed as being particularly serious and the sanctions imposed are accordingly severe. Students are expected to familiarize themselves with and follow the Carleton University Student Academic Integrity Policy (See http://www2.carleton.ca/graduate-studies/policies-and-guidelines). The Policy is strictly enforced and is binding on all students. Plagiarism and cheating — presenting another's ideas, arguments, words or images as your own, using unauthorized material, misrepresentation, fabricating or misrepresenting research data, unauthorized co-operation or collaboration or completing work for another student — weaken the quality of the graduate degree. Academic dishonesty in any form will not be tolerated. Students who infringe the Policy may be subject to one of several penalties including: expulsion; suspension from all studies at Carleton; suspension from full-time studies; a refusal of permission to continue or to register in a specific degree program; academic probation; or a grade of Failure in the course.

Students with Special Needs

Students with disabilities requiring academic accommodations in this course are encouraged to contact a coordinator at the Paul Menton Centre for Students with Disabilities to complete the necessary letters of accommodation. After registering with the PMC, make an appointment to meet and discuss your needs with me at least two weeks prior to the first in-class test or itv midterm exam. This is necessary in order to ensure sufficient time to make the necessary arrangements. Please note the deadline for submitting completed forms to the Paul Menton Centre for formally scheduled exam accommodations is November 7th, 2003 for fall and fall/winter term courses.

References and Course Notes

The course notes are available as PDF files the section on Course Topics. Some references relevant to the course are the following:

[A99] A. Andersson. General Balanced Trees. In Journal of Algorithms, 30(1), pp. 1–18, 1999. [pdf]
[AS96] C. R. Aragon and R. Seidel. Randomized Search Trees. In Algorithmica, Vol. 16, Number 4/5, pp. 464–497, 1996. [postscript] [pdf]
[ASST86] M. D. Atkinson, J.-R. Sack, N. Santoro, and T. Strothotte. Min-Max Heaps and Generalized Priority Queues. In Communications of the ACM, 29(10), pp. 996–1000, 1986. [postscript] [pdf]
[BGH99] J. Basch, L. J. Guibas and J. Hershberger. Data Structures for Mobile Data. Journal of Algorithms, 31(1), pp. 1–28, 1999.
[BGR96] J. Basch, L. J. Guibas and G. D. Ramkumar. Reporting Red-Blue Intersections between Two Sets of Connected Line Segments. Fourth Annual European Symposium on Algorithms (ESA'96), pp. 302–319, Springer, 25–27 September 1996. [postscript] [pdf]
[BF-C00] M. Bender and M. Farach-Colton. The LCA problem revisited. Proceedings of LATIN 2000, 2000. [postscript] [pdf]
[BS80] J. L. Bentley and J. B. Saxe. Decomposable searching problems I: Static-to-dynamic transformation. Journal of Algorithms, 1, 301--358, 1980.
[BSTW86] J. L. Bentley, D. D. Sleator, R. E. Tarjan, and V. K. Wei. A locally adaptive data compression scheme. Communications of the ACM, 29(4), pp. 320–330, 1986. [pdf]
[BSTW86] J. L. Bentley, R. Sedgewick. Fast Algorithms for Sorting and Searching Strings ACM-SIAM Symposium on Discrete Algorithms (SODA 2000), 1987. [postscript] [pdf]
[CLR90] T. H. Cormen, C. E. Leiserson and R. L. Rivest. Introduction to Algorithms. MIT Press, Cambridge MA, 1990.
[CG86a] B. Chazelle and L. J. Guibas, Fractional cascading: I. A data structuring technique. Algorithmica, 1, pp. 133–162, 1986.
[CG86b] B. Chazelle and L. J. Guibas, Fractional cascading: II. Applications. Algorithmica, 1, pp. 163–191, 1986.
[D88] L. Devroye. Applications of the theory of records in the study of random trees. Acta Informatica, 26, pp. 123–130, 1988.
[DSST89] J. R. Driscol, N. Sarnak, D. D. Sleator, and R. E. Tarjan. Making data structures persistent. Journal of Computer and System Sciences, 38(1), pp 86–124, February 1989. [pdf]
[E74] P. van Emde Boas. An O(n log log n) on-line algorithm for the insert-extract min problem. Technical Report, Department of Computer Science, Cornell University, Number TR 74–221, December 1974.
[EKZ77] P. van Emde Boas, R. Kaas and E. Zijlstra. Design and implementation of an efficient priority queue. Mathematical Systems Theory, 10, pp. 99–127, 1977.
[RG93] I. Galperin and R. Rivest. Scapegoat Trees. Proceedings of the 4th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA '93), pp. 165–174, 1993. [pdf]
[GM98] A. Gambin and A. Malinowski. Randomized meldable priority queues. Proceedings of the XXVth Seminar on Current Trends in Theory and Practice of Informatics (SOFSEM'98), pp. 344–349, 1998. [postscript] [pdf]
[I99] J. Iacono. Alternatives to splay trees with O(log n) worst-case access times. In Proceedings of the Twelfth Annual ACM-SIAM symposium on Discrete algorithms (SODA'2001), 2001. [postscript] [pdf]
[KMG09] J. Karkkainen, G. Manzini, and S. J. Puglisi. Permuted Longest-Common-Prefix Array. In Proceedings of Combinatorial Pattern Matching, volume 5577 of LNCS, Springer, pp. 181–192, 2003. [pdf]
[KS03] J. Karkkainen and P. Sanders. Simple linear work suffix array construction. In Proceedings of the 30th International Colloquium on Automata, Languages and Programming, volume 2719 of LNCS, Springer, pp. 943–955, 2003. [pdf]
[I99] C.-B. Liu. Word-Basesd RAM Priority Queues, 2001. [postscript] [pdf]
[M75] K. Mehlhorn. Nearly optimal binary search trees. Acta Informatica, 5, pp. 287–295, 1995. [pdf]
[PY90] M. S. Paterson and F. F. Yao. Efficient binary space partitions for hidden-surface removal and solid modeling. Discrete & Computational Geometry, 5, 1990.
[P90] W. Pugh. Skip Lists: A probabilistic alternative to balanced trees. In Communications of the ACM, 33(6), pp. 668–676, June 1990. [postscript] [pdf]
[P90b] W. Pugh. A skip list cookbook. CS-TR-2286.1, University of Maryland, College Park, 1990. [postscript] [pdf]
[ST86] N. Sarnak and R. E. Tarjan. Planar point location using persistent search trees. Communications of the ACM, 29, pp. 669–679, 1986. [pdf]
[S48] C. E. Shannon. A mathematical theory of communication. Bell Systems Technical Journal, 27, 379–423 and 623–656, 1948. [postscript] [pdf]
[ST85] D.D. Sleator and R.E. Tarjan. Self-adjusting binary search trees. Journal of the ACM, 32(1985), 652–686. [pdf]
[S02] M. Smid. Lowest Common Ancesotr Queries. Lecture Notes, 2002. [postscript] [pdf]
[V80] J. Vuillemin. A unifying look at data structures. Communications of the ACM, 23(4), 229–239, 1980. [pdf]
[W84] D. E. Willard. Log-logarithmic worst-case range queries are possible in space Theta(n). Information Processing Letters, 17, 81–84. 1984.