Topics for New Students

Examples of specific problems that new students can work on. The topics differ widely in scope and difficulty; for basic orientation, the following notation is used:

• A: warm-up exercise, a few hours or days; should be easy to start quickly; the goal is to get acquainted with our systems and data
• B: nontrivial project, a few weeks or months; exploration of new ideas or directions (typically requires learning some new techniques or tools)
• C: large project that can easily grow into a master's thesis (or even a PhD thesis); requires significant planning and a literature review

### Data Analysis

These topics typically consist of "offline" analysis of data. Our lab typically uses the Python ecosystem (pandas, scipy, matplotlib, ...).

Download one (or more) of our data sets and try:

• (A) analysis of item difficulty
• (A) analysis of wrong answers (identification of common wrong answers, patterns in wrong answers, ...)
• (B-C) clustering of similar items or users
• (A-B) visualization of student behaviour within the system (which items are solved, how successfully, ...), visualization of system behaviour (item selection, ordering, ...)
• (B) survival analysis (How long do students stay within the system, or within a specific type of exercise? Can you predict when a student will leave the system?)
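
As a starting point for the first two warm-up topics, here is a minimal sketch that computes a per-item error rate (a crude proxy for difficulty) and the most common wrong answers. The record layout `(student_id, item, given_answer, correct_answer)` and the toy records are assumptions made for illustration; the actual data sets have their own formats. The sketch uses only the standard library; with pandas, the same computation is a `groupby` on the item column.

```python
from collections import Counter, defaultdict

# Hypothetical toy answer log: (student_id, item, given_answer, correct_answer).
# This layout is an assumption; adapt it to the columns of the real data set.
LOG = [
    (1, "7x3", "21", "21"),
    (2, "7x3", "24", "21"),
    (3, "7x3", "21", "21"),
    (1, "6x8", "48", "48"),
    (2, "6x8", "46", "48"),
    (3, "6x8", "46", "48"),
    (4, "6x8", "54", "48"),
]


def error_rates(log):
    """Return {item: fraction of incorrect answers}."""
    attempts = Counter()
    errors = Counter()
    for _student, item, given, correct in log:
        attempts[item] += 1
        if given != correct:
            errors[item] += 1
    return {item: errors[item] / attempts[item] for item in attempts}


def common_wrong_answers(log, top=3):
    """Return {item: [(wrong_answer, count), ...]} with the most frequent first."""
    wrong = defaultdict(Counter)
    for _student, item, given, correct in log:
        if given != correct:
            wrong[item][given] += 1
    return {item: counts.most_common(top) for item, counts in wrong.items()}


if __name__ == "__main__":
    print(error_rates(LOG))
    print(common_wrong_answers(LOG))
```

A raw error rate ignores who answered each item, so it is only a first look; items shown mostly to stronger students will appear easier than they are.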

Specific topics for particular data sets:
• (A) MatMat - commutativity of difficulty - do commutative operations have "commutative difficulty" (e.g., do 7x3 and 3x7 have the same difficulty)?
• (A-B) Umime cesky - "Doplnovacka" task - item difficulty, concept difficulty, automatic detection of "similar" items (using both text processing and data on answers)
• (A-B) MatMat - analysis of (one-digit) multiplication (difficulty of examples, wrong answers, "equivalence classes" like "1xN")
• (B-C) Tutor (logic puzzles) - application and further development of the "constraint relaxation" idea (described in Radek's Sudoku paper) to different puzzles
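
For the commutativity question, one possible first step is to pair each item "axb" with its mirror "bxa" and compare their difficulty estimates. The sketch below assumes that items are named like `"7x3"` and that a per-item error rate has already been computed; both the naming scheme and the numbers in `RATES` are hypothetical, not taken from the MatMat data.

```python
# Hypothetical per-item error rates; the values are made up for illustration.
RATES = {
    "3x7": 0.10,
    "7x3": 0.15,
    "4x6": 0.20,
    "6x4": 0.20,
    "5x5": 0.05,  # its own mirror, so it is skipped
    "2x9": 0.08,  # mirror "9x2" is missing, so it is skipped
}


def commutative_pairs(rates):
    """Return sorted (item, mirror, rate, mirror_rate, difference) tuples.

    Each pair is reported once, for the ordering with a < b.
    """
    result = []
    for item, rate in rates.items():
        a, b = item.split("x")
        mirror = f"{b}x{a}"
        if int(a) < int(b) and mirror in rates:
            result.append((item, mirror, rate, rates[mirror], rate - rates[mirror]))
    return sorted(result)


if __name__ == "__main__":
    for item, mirror, rate, mirror_rate, diff in commutative_pairs(RATES):
        print(f"{item} vs {mirror}: {rate:.2f} vs {mirror_rate:.2f} (diff {diff:+.2f})")
```

A difference in observed rates is not yet evidence of a difference in difficulty: a real analysis would need to account for sample sizes and for how the system selected and ordered the items.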

### Modeling Techniques, Evaluation of Models

These are typically more involved topics (B-C) that require consultation with lab members. The first step is to study some scientific papers; implementation (typically in Python) follows.

• modeling of memory
• application of "deep learning" techniques (particularly recurrent neural networks) to student modeling
• study of relations between several modeling approaches (FAST, mixture modeling), their application to our data
• evaluation of models, model comparison
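
To illustrate what model comparison can look like in its simplest form, the sketch below scores two sets of predicted probabilities of a correct answer with root-mean-square error (RMSE), a commonly used metric. Using RMSE here is an assumption made for illustration, and the outcomes and predictions are made-up toy values, not results from our data.

```python
import math


def rmse(predictions, outcomes):
    """Root-mean-square error between predicted probabilities and 0/1 outcomes."""
    if len(predictions) != len(outcomes):
        raise ValueError("predictions and outcomes must have the same length")
    squared = sum((p - y) ** 2 for p, y in zip(predictions, outcomes))
    return math.sqrt(squared / len(outcomes))


# Toy data: 1 = correct answer, 0 = incorrect answer.
OUTCOMES = [1, 0, 1, 1]
# Baseline that always predicts the overall success rate (3 of 4 correct).
BASELINE = [0.75, 0.75, 0.75, 0.75]
# Predictions of a hypothetical student model.
MODEL = [0.9, 0.2, 0.8, 0.6]

if __name__ == "__main__":
    print(f"baseline RMSE: {rmse(BASELINE, OUTCOMES):.3f}")
    print(f"model RMSE:    {rmse(MODEL, OUTCOMES):.3f}")
```

Comparing against a constant baseline is a useful sanity check. In a real evaluation the predictions must come from data the model was not trained on, and the choice of metric is itself part of the topic.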

### Educational System Development

Front-end extensions (A-B), typically using JavaScript and libraries:

Full web development topics (C):
• technology: SQL, git, bash, latex, pandas, ...
• mathematics (see mathematics examples in Problem Solving Tutor for inspiration): graphs and functions, linear algebra (vectors, matrices, linear transformations, ...)
• chemistry (periodic table, properties of the elements, ...)
• language learning (vocabulary) "in personalized context"
2016