Hypertextbook Project FAQ


General FAQ for Hypertextbook Project: Open Issues

Last updated on June 8, 2011.
To be updated as the project matures.


How do we start a management process to organize the project?
We can form a steering committee of people who want to work on the project. Current candidates include:
Cliff Shaffer, Tom Naps, Ari Korhonen, Mikko Laakso, Guido Roessling, Pilu Crescenzi, Ville Karavirta.

Where/how do we communicate?
Of course, a big part of the answer depends on how open we want to be. Private discussions could be done via email, Skype, or Piazza. For now, public discussions will take place at the AlgoViz.org forum set up for the project.

What is the name of the project?
We need a name. Mikko suggested "NiGHT" for "Nintendo Generation HyperTextbook". I don’t like that so much, since neither "Night" nor "Nintendo" conveys good connotations. "Open" in the name is always a good idea. Several people refer to the topic as DSA (for Data Structures and Algorithms). Maybe some name built from OpenDSAHyperTextbook. And then there was VizCoSH from the 2006 ITiCSE working group (Visualization-based Computer Science Hypertextbook). OpenAlgoVizCoSH? That is quite a mouthful. Still looking… My favorite right now is OpenDSA.

What language is the text written in?
English seems the obvious choice. However, we should consider the feasibility of using "internationalization" techniques to translate it into other languages. There are two major parts to this, each with subparts. The first is the text. Can that be translated somewhat automatically? An example of a subpart is the text in any figures. The other major part is the AVs. Guido, for example, has thought about internationalization of AVs, and hopefully we can apply some of those techniques to, for example, standard controls. If pseudocode is in a standalone markup file, perhaps it can be translated.
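
As a rough illustration of what internationalizing "standard controls" could mean, here is a minimal sketch of a label lookup with fallback to English. Everything in it is an assumption made for illustration: the CONTROL_LABELS table, its keys, and the label() helper are hypothetical and are not taken from Guido's work or from any existing AV system.

```python
# Hypothetical string table for standard AV controls. The keys and the set of
# languages are illustrative assumptions, not part of any existing AV system.
CONTROL_LABELS = {
    "en": {"play": "Play", "pause": "Pause",
           "step_forward": "Step forward", "reset": "Reset"},
    "de": {"play": "Abspielen", "pause": "Pause",
           "step_forward": "Schritt vor"},
    "fi": {"play": "Toista", "pause": "Tauko"},
}

DEFAULT_LANGUAGE = "en"


def label(key, language):
    """Return the label for a control in the requested language.

    Falls back to the English entry when the language is unknown or when
    that language has no translation for this particular control yet.
    """
    table = CONTROL_LABELS.get(language, {})
    if key in table:
        return table[key]
    return CONTROL_LABELS[DEFAULT_LANGUAGE][key]
```

With a fallback like this, a partially translated AV still shows a usable (English) label for any control that has not been translated yet.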

Where does the text come from?

Cliff Shaffer and Tom Naps have both written textbooks for which we have the rights to post online. Cliff’s book is available online in PDF format at http://people.cs.vt.edu/~shaffer/Book/ . It currently gets about 10,000-11,000 downloads per month.

What is the content coverage?

The project goal is to cover topics suitable for a range of "data structures and algorithms" courses, typically taught in a variety of settings from CS2 through a senior-level algorithms course, with the focus mainly on topics suitable to the sophomore or junior level.

Cliff’s book is designed for a post-CS2 course, and for senior algorithms. It covers: mathematical background; algorithm analysis; lists/stacks/queues; binary trees and their representation (BSTs, Heaps, Huffman Coding); general tree representations (including UNION/FIND); internal sorting; file processing (including buffer pools and external sorting); search algorithms (including binary search, dictionary search, hashing, self-organizing lists); indexing methods for large data (including linear indexing, 2-3 trees and B-trees); graph algorithms (representation, DFS/BFS, topological sort, shortest paths (Dijkstra’s algorithm), Prim’s and Kruskal’s MCST); memory management (sequential fit methods, a little on garbage collection); skip lists; sparse matrix representations; various advanced tree structures (AVL and splay trees, spatial data structures, tries); techniques for solving summations and recurrences; algorithmic patterns (greedy algorithms, dynamic programming); lower bounds proofs (decision trees, adversary proofs, state space proofs); math algorithm topics (exponentiation, GCD, matrix multiplication and FFT); reductions and NP-completeness; and non-computable functions.

For people who want to use this for a CS2 course, some lower-level topics might need to be added.

What programming language are coding examples written in?
Possibilities include Java, C++ (personally, my least favorite choice), pseudocode (Tom and I like that idea but are not adamant about it), Python, and JavaScript. One technical advantage of pseudocode might be that it makes it easier to create hierarchical marked-up algorithm descriptions like in AIA.

What programming language will be used to implement interactive elements?
HTML5 seems to be the best compromise. It will run seamlessly on a wide variety of mobile platforms with no plugins, as well as on all major browsers. This means that the activities are actually written in JavaScript. A project of this magnitude, with potentially a lot of developers involved, would benefit from a program development environment, most likely in the form of a library with an accompanying API. A companion document outlines our ideas on what such a library would include.

What is the hypertextbook development environment?
Cliff and Tom have submitted an NSF proposal in collaboration with the Connexions project (http://cnx.org). Connexions provides a Creative Commons environment for eTextbooks, with good support for authoring (though there are a few deficiencies with the content representation that we wish to improve on) and for cloning/sharing/reusing content. The biggest downside that we see to Connexions is that they do not support HTML5. At the moment, they are reluctant to allow JavaScript on their site. However, they propose a workaround of hosting the actual JavaScript code off-site (such as on SourceForge), with links from the content pages to the scripts. To users, this would look pretty much identical to native HTML5.

What is the plan for defining the detailed book contents and the AVs needed?
Stage 1 is to develop a detailed Storyboard for the entire book. This can start with, for example, Cliff’s text, with all places marked where an activity is called for (which should include all points where an algorithm’s behavior is described). A rough estimate is that 100-200 activities and AVs are needed. To help define what we want to see in the AVs, the Storyboard can include links to Java applets and URLs to non-Java AVs that already exist, along with descriptions of what needs to be changed in each one. As examples of what the Storyboard version might look like, see Virginia Tech’s hashing tutorial (http://research.cs.vt.edu/AVresearch/hashing/) and TRAKLA2’s heaps tutorial (http://svg.cs.hut.fi/heaptutorial/).

As an example, I have developed a detailed storyboard for a Shellsort tutorial (there is a thread in this forum to discuss it). Ville Karavirta and I are working on an example implementation for this as well.
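
To give a feel for the kind of data an AV for this tutorial would step through, here is a minimal sketch that runs Shellsort and records the array after each pass. It is not the storyboard or the implementation mentioned above: the function name shellsort_trace, the frame format, and the choice of increments (n/2, n/4, ..., 1) are all assumptions made for illustration.

```python
def shellsort_trace(values):
    """Sort a copy of `values` with Shellsort, recording one frame per pass.

    Increments are len//2, len//4, ..., 1 (an assumed, simple choice).
    Each frame is a (label, snapshot) pair that a visualization could
    display as one step of the animation.
    """
    data = list(values)
    frames = [("start", list(data))]
    gap = len(data) // 2
    while gap > 0:
        # Insertion sort over elements that are `gap` positions apart.
        for i in range(gap, len(data)):
            item = data[i]
            j = i
            while j >= gap and data[j - gap] > item:
                data[j] = data[j - gap]
                j -= gap
            data[j] = item
        frames.append((f"gap {gap}", list(data)))
        gap //= 2
    return frames
```

Calling shellsort_trace([5, 2, 9, 1, 7, 3]) yields a "start" frame followed by one frame for each increment, with the last frame holding the sorted list.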

How can we put AVs into the storyboard?
While I mentioned above the possibilities of Java applets and URLs in the Storyboard, this would not be at all ideal. Between the people already listed as interested in the project, we have a huge inventory of AVs. It would be nice to leverage them. JHAVE, ANIMAL, and ALVIE are all similar in that they are essentially generated from something like a scripting language. Perhaps it is possible to write XAAL translators for each of these? Ville has already written a XAAL-to-JavaScript outputter. So if we had the translators, then we would have a LOT of HTML5-compatible visualizations in a hurry. The only downside to this is that we might be too tempted to adopt existing AVs as they are instead of improving them to some standard.

What is the plan for managing implementation of the AVs?
Once the Storyboard is in place, we can put out an open invitation for developers to contribute to the project. Some sort of reviewing process needs to be put in place for dealing with submissions and integrating them into the project. As the project develops, a library of utility functionality would be built. In fact, some of us have started talking about our dream AV development library.

How is assessment integrated into the project?
There are a lot of details to work out here. In fact, this seems to be the most difficult aspect, technically. The goal is for online assessment to be deeply integrated. This includes a question bank for every section, with mechanisms in place to track student performance, report this to students and their instructors, and ideally to control their progress through the material as they perform better or more poorly. There needs to be a rich assortment of exercise types that can be created. Obvious types include T/F, multiple choice, short answer, essay (which can’t be graded automatically). But there are also a lot of special types that might be of value for this project, including coding exercises (simple things can be autograded by matching outputs). Connexions is developing appropriate infrastructure, and Mikko and Ari have infrastructure as well. See http://ville.cs.utu.fi .
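
The remark that simple coding exercises can be autograded by matching outputs can be sketched as follows. This is only an illustration under stated assumptions: the normalize() and grade() helpers and the test-case format are hypothetical, and this is not how Connexions or the infrastructure at http://ville.cs.utu.fi actually works.

```python
def normalize(output):
    """Strip trailing whitespace from each line and drop trailing blank lines,
    so that harmless formatting differences do not count as wrong answers."""
    lines = [line.rstrip() for line in output.splitlines()]
    while lines and lines[-1] == "":
        lines.pop()
    return lines


def grade(student_function, test_cases):
    """Return the fraction of test cases whose output matches the expected text.

    `test_cases` is a list of (argument, expected_output) pairs (an assumed
    format). A submission that raises an exception fails that test case.
    """
    passed = 0
    for argument, expected in test_cases:
        try:
            actual = student_function(argument)
        except Exception:
            continue
        if normalize(str(actual)) == normalize(expected):
            passed += 1
    return passed / len(test_cases) if test_cases else 0.0
```

A score like the one returned by grade() is the kind of value that could be recorded per student and per section, and then reported to instructors or used to control progress through the material.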

How will the project be funded?
Cliff and Tom have submitted an NSF proposal for $200,000 over two years. If funded, this should allow us to get the project established to the extent of defining the infrastructure, workflow, and management processes, along with developing the Storyboard and a few sample sections. Hopefully after that, additional NSF funding could be acquired to recruit users at a number of US universities and to support additional AV development. The other collaborating teams could apply for national or EU-level funding to fund their individual groups to do parts of the effort. The ideal situation would be some sort of international-level funding to support an international management team in some more integrated fashion. We would need to identify programs that support such collaborative efforts between (for example) the US and EU.