\documentclass[fleqn]{article}
\usepackage[round,longnamesfirst]{natbib}
\usepackage{graphicx,keyval,a4wide,thumbpdf,makeidx,color,colordvi}
\usepackage{amsfonts,hyperref,graphicx}

\newcommand\R{\textsf{R}}
\newcommand{\pkg}[1]{{\normalfont\fontseries{b}\selectfont #1}}
\newcommand{\sQuote}[1]{`{#1}'}
\newcommand{\dQuote}[1]{``{#1}''}
\newcommand{\file}[1]{\sQuote{\textsf{#1}}}
\newcommand{\data}[1]{\texttt{#1}}
\newcommand{\var}[1]{\textit{#1}}
\newcommand{\class}[1]{\textsf{#1}}
\newcommand{\proglang}[1]{\textsf{#1}}
%% \code without `-' ligatures
\def\nohyphenation{\hyphenchar\font=-1 \aftergroup\restorehyphenation}
\def\restorehyphenation{\hyphenchar\font=`-}
{\catcode`\-=\active%
  \global\def\code{\bgroup%
    \catcode`\-=\active \let-\codedash%
    \Rd@code}}
\def\codedash{-\discretionary{}{}{}}
\def\Rd@code#1{\texttt{\nohyphenation#1}\egroup}
\newcommand{\codefun}[1]{\code{#1()}}
\newcommand{\codefunind}[1]{\codefun{#1}\index{\texttt{#1}}}
\newcommand{\codeind}[1]{\code{#1}\index{\texttt{#1}}}

\SweaveOpts{strip.white=true}

\AtBeginDocument{\setkeys{Gin}{width=0.5\textwidth}}

\definecolor{Blue}{rgb}{0,0,0.8}
\definecolor{Red}{rgb}{0.7,0,0}

% \date{2018-01-03}
\title{Knowledge Space Theory}
\author{Christina Stahl \and Cord Hockemeyer}
%\VignetteIndexEntry{KST}
%\VignetteDepends{kst}
%\VignetteKeywords{knowledge structure, knowledge space}
%\VignettePackage{kst}

\makeindex{}

\sloppy{}

\begin{document}
\maketitle

\begin{abstract}
This document explains algorithms and basic operations of knowledge 
structures and knowledge spaces available in \R{} through the \pkg{kst} 
package.
\end{abstract}

\tableofcontents
% \clearpage

<<echo=FALSE>>=
options(width = 80)
library("kst")
@ %

\section{Introduction}
\label{sec:introduction}
\noindent
Knowledge Space Theory \citep{Doignon+Falmagne:1999} is a set- and
order-theoretical 
framework, which proposes mathematical formalisms to operationalize knowledge 
structures in a particular domain. The most basic assumption of knowledge 
space theory is that every knowledge \emph{domain} can be represented in 
terms of a set of domain problems or items. Moreover, knowledge space theory 
assumes dependencies between these items in that knowledge of a given item or 
a subset of items may be a prerequisite for knowledge of another, more 
difficult or complex item. These prerequisite \emph{relations} are realized by 
surmise relations, which create a quasi-order between different items. One 
advantage of these surmise relations is that they reduce the quantity of all 
possible solution patterns to a more manageable amount of knowledge states. 
Each of these knowledge states represents the subset of items an individual 
is capable of solving. The collection of all knowledge states captures the
organization of the domain and is referred to as \emph{knowledge structure}.

\section{Classes introduced in the \pkg{kst} package}
\label{sec:classes}

The \pkg{kst} package makes heavily usage of \R{} classes. Figure~\ref{fig:classes}
gives an overview of these classes and the relationships between them.
\begin{figure}[htbp]
    \begin{center}
	\includegraphics{kst-classes.png}
	\caption{\R{} classes in the \pkg{kst} package}\label{fig:classes}
    \end{center}
\end{figure}

The basic class is the \codeind{kfamset} class which describes quite
genereally a family of sets. All other classes are effectively sub-classes
of \code{kfamset}. 

The \codeind{kstructure} class describes knowledge structures. The difference
between a \code{kfamset} and a \code{kstructure} is that the latter contains
the empty set $\emptyset$ and the full item set. A special case of 
knowledge structures are the knowldge spaces described by the \code{kspace}
class: they are closed under union. 

The fourth class defined in \pkg{kst} ist the \codeind{kbasis} class which
describes the basis of a knowledge space. Please note that a \code{kbasis}
does not contain the empty set $\emptyset$ and often neither the full
item set.

\section{Knowledge Structures}
\label{sec:kstructure}

The \codefunind{kstructure} function in package \pkg{kst} is the basic 
constructor for knowledge structures. It takes an endorelation 
representing a surmise relation or a set of sets each representing one 
knowledge state (e.g., one clause of a surmise system) and returns the 
corresponding knowledge structure:
<<kstructure>>=
# An endorelation representing a surmise relation
kst <- endorelation(graph=set(tuple(1,1), tuple(2,2), tuple(3,3),
  tuple(4,4), tuple(2,1), tuple(3,1), tuple(4,1), tuple(3,2), tuple(4,2)))
kstructure(kst)
# A set of sets representing knowledge states (e.g., clauses of a surmise system)
kst <- kstructure(set(set("a"), set("a","b"), set("a","c"), set("d","e"), 
   set("a","b","d","e"), set("a","c","d","e"), set("a","b","c","d","e")))
kst
@

Note that by default the quotes indicate the fact that the items are 
represented by characters. For displaying purposes, these quotes may be
turned off:
<<set_options>>=
sets_options("quote",FALSE)
kst
@

On the resulting knowledge structure several operations can be performed. 
Firstly, the knowledge \emph{domain} of the knowledge structure can be 
determined by means of the \codefunind{kdomain} function:
<<kdomain>>=
kdomain(kst)
@

Secondly, the \emph{notions} of the knowledge structure can be determined by 
means of the \codefunind{knotions} function. A notion is a set of items 
always jointly contained in all knowledge states. Consequently, these items 
carry the same information and may therefore be considered equivalent: 
<<knotions>>=
knotions(kst)
@

Thirdly, the \emph{atoms} of the knowledge structure can be determined by 
means of the \codefunind{katoms} function. For any item of the knowledge 
domain, an atom is a minimal knowledge state containing the respective item, 
where minimal refers to the fact that the respective knowledge state is not 
the union of any other knowledge states:
<<katoms>>=
katoms(kst, items=set("a","b","c"))
@

Forthly, the \emph{trace} of the knowledge structure can be determined by 
means of the \codefunind{ktrace} function. The trace of a knowledge structure 
on a set of items is the substructure of the knowledge structure on these 
items, i.e., the substructure resulting from restricting the knowledge 
structure to the specified item(s):
<<ktrace>>=
ktrace(kst, items=set("c","d","e"))
@

Fifthly, the \codefunind{knneighbourhood} and \codefunind{kneighbourhood}
functions compute the n-neighbourhood and neighbourhood of a knowledge state. 
<<kneighbourhood>>=
knneighbourhood(kst, state=set("a", "b"), distance=2)
kneighbourhood(kst, state=set("a", "b"))
@

Finally, the \codefunind{kfringe} function allows for determining the fringe of a
knowledge state. Fringes determine the symmetric difference between a given knowledge
state and its neighbouring states.
<<kfringe>>=
kfringe(kst, state=set("a", "b"))
@

In addition, several properties of knowledge structures may be tested. Currently, only the functions \codefunind{kstructure\_is\_wellgraded} and the \codefunind{kstructure\_is\_space} are implemented (see Section 2 for the latter). 

A knowledge structure is considered \emph{well-graded} if any two of its states are connected by a bounded path, i.e., each knowledge state (except the empty set \emph{\{\}}) has at least one predecessor state that contains the same domain items with the exception of exactly one and each knowledge state (except the state for the full set of domain problems \emph{Q}) has at least one immediate successor state that comprises the same domain items plus exactly one.   
<<kstructure_is_wellgraded>>=
kstructure_is_wellgraded(kst)
@

Apart from these basic operations, the \pkg{kst} package also provides 
plotting functionalities for knowledge structures (see README for details). The \codefunind{plot} 
method takes an arbitrary knowledge structure and plots a Hasse Diagram of 
the respective knowledge structure (see Figure~\ref{fig:hasse-kst}):
<<plot,fig=FALSE>>=
if(requireNamespace("Rgraphviz")) {Rgraphviz::plot(kst)}
@

\begin{figure}[h]
\begin{center}
<<fig=TRUE,echo=FALSE>>=
<<plot>>
@
\caption{Knowledge Structure}
\label{fig:hasse-kst}
\end{center}
\end{figure}

In order to allow for plotting the surmise relation underlying a knowledge 
structure, the \pkg{kst} package provides the \codefunind{as.relation} 
method, which computes its underlying surmise relation, i.e., the set of
item pairs corresponding to the knowledge dependencies. Antisymmetric and 
transitive surmise relations may then be plotted as a Hasse diagram:
<<as.relation>>=
as.relation(kst)
@

In those cases where individuals' response patterns are available, they may 
be used to assess individuals or validate a knowledge structure.

\subsection{Assessment and Validation}
The \codefunind{kassess} function assigns individuals to their corresponding 
knowledge state in a knowledge structure. Currently only 
\dQuote{deterministic} assessment is implemented. Assessing individuals based 
on a deterministic procedure starts by determining an item \emph{a}, which is 
contained in approximately half of the available knowledge states. If the 
individual being assessed has successfully solved the respective item 
\emph{a}, all knowledge states that do not contain item \emph{a} are removed 
from the set of potential knowledge states of the individual. If, on the 
other hand, the individual has not solved the respective item \emph{a}, all 
knowledge states that do contain item \emph{a} are removed from the set of 
potential knowledge states of the individual. From the remaining knowledge 
states an item \emph{b}, which again is contained in approximately half of 
the still available knowledge states, is selected. If the individual has 
successfully solved the respective item \emph{b}, all knowledge states that 
do not contain item \emph{b} are removed from the set of potential knowledge 
states of the individual. If, on the other hand, the individual has solved 
the respective item \emph{b}, all knowledge states that do contain item 
\emph{b} are removed from the set of potential knowledge states of the 
individual. This procedure is repeated until only one knowledge state is 
left. This is the knowledge state the individual is currently located in.
<<kassess>>=
rp <- data.frame(a=c(1,1,0,1,1,1,1,0,0,0),b=c(0,1,0,1,0,1,0,1,0,0),
   c=c(0,0,0,0,1,1,1,0,1,0),d=c(0,0,1,1,1,1,0,0,0,1), e=c(0,0,1,1,1,1,0,0,0,0))
kassess(kst, rpatterns=rp)
@

The \codefunind{kvalidate} function on the other hand calculates validity 
coefficients for prerequisite relations and knowledge structures. The 
\emph{$\gamma$-Index} \citep{Goodman+Kruskal:1972} validates the prerequisite 
relation underlying a knowledge structure and assumes that not every response 
pattern is represented by a prerequisite relation. For this purpose it 
compares the number of response patterns that are represented by a 
prerequisite relation (i.e., concordant pairs) with the number of response 
patterns that are not represented by a prerequisite relation (i.e., 
discordant pairs). Formally, the $\gamma$-Index is defined as 
\[\gamma = \frac{N_c - N_d}{N_c + N_d}\] where $N_c$ is the number of 
concordant pairs and $N_d$ the number of discordant pairs. Generally, a 
positive $\gamma$-value supports the validity of prerequisite relations.

The validation method \emph{percent} likewise validates prerequisite 
relations and assumes that more difficult or complex items are solved less 
frequently than less difficult or complex items. For this purpose it 
calculates the relative solution frequency for each of the items in the 
domain.

The \emph{Violational Coefficient} \citep{Schrepp+Held+Albert:1999} also 
validates prerequisite relations. For this purpose, the number of violations 
(i.e., the earlier mentioned discordant pairs) against a prerequisite relation 
are calculated. Formally, the VC is defined as 
\[VC = \frac{1}{n(|S| - m)} \sum_{x,y} v_{xy}\] where $n$ denotes the number 
of response vectors, $|S|$ refers to the number of pairs in the relation, 
$m$ denotes the number of items, and $v_{xy}$ again refers to the number of 
discordant pairs. Generally, a low VC supports the validity of prerequisite 
relations.

In contrast to the other three indices, the \emph{Distance Agreement 
Coefficient} \citep{Schrepp:1999} validates the resulting knowledge structure. 
For this purpose it compares the average symmetric distance between the 
knowledge structure and respone patterns (referred to as \emph{ddat}) to the 
average symmetric distance between the knowledge structure and the power set 
of response patterns (referred to as \emph{dpot}). By calculating the ratio 
of \emph{ddat} and \emph{dpot}, the DA is determined. Generally, a lower 
DA-value indicates a better fit between a knowledge structure and a set of 
response patterns.
<<kvalidate>>=
kvalidate(kst, rpatterns=rp, method="gamma")
kvalidate(kst, rpatterns=rp, method="percent")
kvalidate(kst, rpatterns=rp, method="VC")
kvalidate(kst, rpatterns=rp, method="DA")
@

\subsection{Set--related Methods}
Apart from these kst-specific functions, the \pkg{kst} package also provides 
general set-related methods. In particular, these include methods pertaining 
to the \emph{closure} and \emph{reduction} of sets.

The \codefunind{closure} method for objects of class \codeind{kstructure} 
performs the closure of a knowledge structure by computing the union or 
intersection of any two knowledge states. \codefunind{union} is also used as 
a basis for the \codefunind{kspace} function (see next section).
<<closure>>=
closure(kst, operation="union")
@

The \codefunind{reduction} method performs the reduction of a knowledge 
structure by computing the minimal subset having the same closure as the 
knowledge structure. Additionally, it allows for computing the 
\emph{discriminative} reduction of a knowledge structure. Such a 
discriminative reduction is a knowledge structure in which each notion 
contains a single item.
<<reduction>>=
reduction(kst, operation="discrimination")
@

\subsection{Families of Sets}\label{sec:famset}
A more general concept than the knowledge structures represented by
\codeind{kstructure}s are the families of (sub)sets represented by the
\codeind{kfamset} class. The \codefunind{kfamset} constructor takes the same
arguments as the \codefunind{kstructure} constructor.
<<kfamset>>=
# An endorelation representing a surmise relation
# A set of sets representing knowledge states (e.g., clauses of a surmise system)
kfs <- kfamset(set(set("a"), set("a","b"), set("a","c"), set("d","e"), 
   set("a","b","d","e"), set("a","c","d","e"), set("a","b","c","d","e")))
kfs
@
Figure~\ref{fig:hasse-kfamset} shows the Hasse diagram of this family of sets.
Please note the difference to Fig.~\ref{fig:hasse-kst} which also contains the empty set $\emptyset$.
<<plotfamset,fig=FALSE>>=
if(requireNamespace("Rgraphviz")) {Rgraphviz::plot(kfs)}
@

\begin{figure}[h]
\begin{center}
<<fig=TRUE,echo=FALSE>>=
<<plotfamset>>
@
\caption{Family of sets}
\label{fig:hasse-kfamset}
\end{center}
\end{figure}

\section{Knowledge Spaces}
\label{sec:kspace}

Apart from knowledge structures, knowledge space theory also suggests the 
concept of \emph{knowledge spaces}. A knowledge structure is considered a 
knowledge space if it includes one state for the empty set \emph{\{\}}, one 
state for the full set of domain items, and a state for the union of any two 
knowledge states (i.e., the closure under union). The basic constructor for
creating knowledge spaces is the \codefunind{kspace} function. It takes an 
arbitrary knowledge structure and returns the corresponding knowledge space:
<<kspace>>=
ksp <- kspace(kst)
ksp
@

In order to test for the space property of a knowledge structure, the 
\pkg{kst} package provides the function \codefunind{kstructure\_is\_space}:
<<kstructure_is_space>>=
kstructure_is_kspace(ksp)
@

Apart from the functions described in the previous section, which can 
likewise be performed on knowledge spaces, the package \pkg{kst} provides the 
additional function \codefunind{kbase}, which is only applicable to knowledge 
spaces. The \codefunind{kbase} function takes an arbitrary knowledge space and 
computes its \emph{base}. A base for a knowledge space is a minimal family of 
knowledge states spanning the knowledge space, i.e., the base includes the 
minimal states sufficient to reconstruct the full knowledge space. 
A knowledge structure has a base only if it is a knowledge space.
<<kbase>>=
kbase(ksp)
@

\section{Learning Paths}
\label{sec:lpath}

Both, knowledge structures and knowledge spaces, involve the concept of \emph{learning paths}. A learning path is a maximal sequence of knowledge states, which allows learners to gradually traverse a knowledge structure or space from the empty set \emph{\{\}} (or any other bottom state) to the full set of domain problems \emph{Q}. The basic constructor for creating learning paths is the \codefunind{lpath} function. It takes an arbitrary knowledge structure or space and computes all possible learning paths in the respective knowledge structure or space. The result is a list where each element represents one learning path:
<<lpath>>=
lp <- lpath(ksp)
lp
@

A learning path is considered a \emph{gradation} if each state in the respective learning path differs from its predecessor and/or successor state by a single item/notion. The \codefunind{lpath\_is\_gradation} function allows for testing the gradation property of a list of learning paths:
<<lpath_is_gradation>>=
lpath_is_gradation(lp)
@

\section{Utilities}
Other KST related \R\ packages (\pkg{DAKS} and \pkg{pks}) use matrices
as representation for knowledge structures and data sets. The
\codefunind{as.famset} function converts such matrices into families
(i.e. \code{sets}) of \code{sets}.  
<<as.famset>>=
m <- matrix(c(1, 0, 0, 1, 1, 0), nrow = 2, ncol = 3)
m
as.famset(m)
as.famset(m, as.letters = FALSE)
@
The inverse function is given by the \codefunind{as.binaryMatrix} function
which returns the matrix representation of a \code{kstructure}.
<<as.matrix>>=
as.binaryMatrix(ksp)
@

{\small
  \bibliographystyle{abbrvnat}
  \bibliography{kst}
}

\printindex{}

\end{document}