Improve minor inconsistencies. #142
base: master
Conversation
Thanks for looking this stuff over! The code changes are good, but I'm reluctant to change the original authors' language on mastery vs. knowledge given the definitions.
@@ -41,9 +41,9 @@ In educational measurement, cognitive diagnosis models (CDMs) have been used to

The *deterministic inputs, noisy "and" gate* (DINA) model [@Junker2001] is a popular conjunctive CDM, which assumes that a respondent must have mastered all required attributes in order to correctly respond to an item on an assessment.

To estimate respondents' knowledge of attributes, we need information about which attributes are required for each item. For this, we use a Q-matrix, which is an $I \times K$ matrix where $q_{ik}=1$ if item $i$ requires attribute $k$ and 0 if not. $I$ is the number of items and $K$ is the number of attributes in the assessment.
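To make the Q-matrix concrete, here is a minimal R sketch; the values and names below are hypothetical, not the Q-matrix used in the case study:

# Hypothetical Q-matrix for I = 4 items and K = 3 attributes
# (illustrative only). Entry Q[i, k] = 1 means item i requires attribute k.
Q <- matrix(
  c(1, 0, 0,
    1, 1, 0,
    0, 1, 1,
    0, 0, 1),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("item", 1:4), paste0("attr", 1:3))
)
Q

Under the conjunctive DINA assumption, item 2 in this sketch would require mastery of both attribute 1 and attribute 2 for a non-guessed correct response.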
Seung Yeon Lee's original text says:
The presence and absence of attributes are referred to as “mastery” and “non-mastery” respectively. A respondent’s knowledge is represented by a binary vector, referred to as “attribute profile”, to indicate which attributes have been mastered or have not.
So I think either "knowledge" or "mastery" would work here (knowledge being the set of attributes mastered), so I'd be reluctant to change this.
Right, so attributes are 'mastered' (not 'known'). The high-level concept of knowledge is characterized by (low-level) attribute mastery. I'm going by the author's terminology. It's not a big deal, it's just that using another term to mean the same thing or using metonymy (as is the case here) can create unnecessary extra cognitive load, if not confusion. And you don't want this in materials which are already difficult to digest. But, if you insist, I will, of course, revert this change.
I also find this terminology confusing. It's doubly confusing because it's trying to overload two commonly used nouns with technical meanings. The original author of the DINA model introduced the terminology, it looks like. We're often walking the line between our general preferences for naming, etc., and that of a subfield we work with. Andrew's particularly prone to try to change entrenched terminology.
-A binary latent variable $\alpha_{jk}$ indicates respondent $j$'s knowledge of attribute $k$, where $\alpha_{jk}=1$ if respondent $j$ has mastered attribute $k$ and 0 if he or she has not. Then, an underlying attribute profile of respondent $j$, $\boldsymbol{\alpha_j}$, is a binary vector of length $K$ that indicates whether or not the respondent has mastered each of the $K$ attributes.
+A binary latent variable $\alpha_{jk}$ indicates respondent $j$'s mastery of attribute $k$, where $\alpha_{jk}=1$ if respondent $j$ has mastered attribute $k$ and 0 if he or she has not. Then, an underlying attribute profile of respondent $j$, $\boldsymbol{\alpha_j}$, is a binary vector of length $K$ that indicates whether or not the respondent has mastered each of the $K$ attributes.
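As an illustrative aside (a sketch, not code from the case study): with $K$ attributes there are $C = 2^K$ possible attribute profiles, which can be enumerated in R as:

# Enumerate all C = 2^K binary attribute profiles for K = 3 attributes.
K <- 3
profiles <- as.matrix(expand.grid(rep(list(0:1), K)))
colnames(profiles) <- paste0("attr", 1:K)
profiles  # each row is one possible attribute profile alpha_c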
[doesn't need to be fixed for this PR]
The lines should be max 80 characters if possible, because otherwise we get a mess in Git whenever there are edits and we are looking at 300-character lines.
I completely agree. The command-line diff is also showing me that there is trailing whitespace, so I'll fix both formatting issues at the same time.
<!-- This comment causes section to be numbered -->
[not required for this PR]
Are these lines different? You can also just specify whether sections are numbered in the top-level yaml.
Hi @bob-carpenter ! Thanks for reviewing.
I didn't mean to submit this change specifically; it's the text editor (in this case, GitHub's editor, but Vim would do the same thing) which adds a "newline at end of file" whenever it's missing.
Indeed, that's what the command-line diff looks like:
-<!-- This comment causes section to be numbered -->
\ No newline at end of file
+<!-- This comment causes section to be numbered -->
I would think it's a healthy change although very minor.
Ah, missed that---thanks!
@@ -110,9 +115,9 @@ $$
\mathrm{Pr}(\alpha_{jk}=1 \, | \, \boldsymbol{y}_j)=\sum_{c=1}^{C}\mathrm{Pr}(\boldsymbol{\alpha_j}=\boldsymbol{\alpha_c} \, | \, \boldsymbol{y}_j)\times\alpha_{ck}.
$$

-Instead of conditioning on the parameters $\nu_c,s_i,g_i$ to obtain $\mathrm{Pr}(\boldsymbol{\alpha_j}=\boldsymbol{\alpha_c}|\boldsymbol{Y}_j=\boldsymbol{y}_j)$, we want to derive the posterior probabilities, averaged over the posterior distribution of the parameters. This is achieved by evaluating the expressions above for posterior draws of the parameters and averaging these over the MCMC iterations. Let the vector of all parameters be denoted $\boldsymbol{\theta}$ and let the posterior draw in iteration $s$ be denoted $\boldsymbol{\theta}^{(s)}_{.}$ Then we estimate the posterior probability, not conditioning on the parameters, as
+Instead of conditioning on the parameters $\nu_c,s_i,g_i$ to obtain $\mathrm{Pr}(\boldsymbol{\alpha_j}=\boldsymbol{\alpha_c}|\boldsymbol{Y}_j=\boldsymbol{y}_j)$, we want to derive the posterior probabilities, averaged over the posterior distribution of the parameters. This is achieved by evaluating the expressions above for posterior draws of the parameters and averaging these over the MCMC iterations. Let the vector of all parameters be denoted $\boldsymbol{\theta}$ and let the posterior draw in iteration $t$ be denoted $\boldsymbol{\theta}^{(t)}_{.}$ Then we estimate the posterior probability, not conditioning on the parameters, as
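For what it's worth, the averaging described in this paragraph can be sketched in R as follows. All names here are hypothetical, and the simulated `prob_draws` matrix merely stands in for the per-iteration class probabilities $\mathrm{Pr}(\boldsymbol{\alpha_j}=\boldsymbol{\alpha_c} \,|\, \boldsymbol{y}_j, \boldsymbol{\theta}^{(t)})$ for one respondent:

# Hypothetical per-iteration class probabilities for one respondent:
# T_iter MCMC draws x C attribute classes (simulated stand-in values).
set.seed(42)
K <- 3; C <- 2^K; T_iter <- 1000
profiles <- as.matrix(expand.grid(rep(list(0:1), K)))  # C x K, row c = alpha_c
prob_draws <- matrix(rexp(T_iter * C), T_iter, C)
prob_draws <- prob_draws / rowSums(prob_draws)  # each draw sums to 1 over classes

# Average over MCMC iterations: Pr(alpha_j = alpha_c | y_j)
class_probs <- colMeans(prob_draws)

# Marginalize to attribute-level mastery: Pr(alpha_jk = 1 | y_j)
mastery_probs <- as.vector(class_probs %*% profiles)
mastery_probs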
@bob-carpenter Here, the change is only about using $t$ instead of $s$ for the MCMC iteration index, so it doesn't clash with the slipping parameter $s_i$.
Hello @education-stan !
I've been studying this example as I'm planning to draw on it and cite it at an upcoming seminar I'll be giving (maybe I should post about it in Discourse, in the spirit of this recent comment). I have emailed the author @sylee30 with a question I may have figured out in the meantime...
Anyway, I'm suggesting the following edits after running the R code and considering that 'knowledge' is used in an abstract way (knowledge is represented by attribute/skill mastery).
Best,
Marianne