Cleanup multiple.md #13
Open
dmbates wants to merge 1 commit into main from db/cleanup_multiple
dmbates (Contributor) commented Sep 29, 2021
- one line per sentence
- do not capitalize section or subsection titles
- add a few of the plots described in the text
palday reviewed Sep 29, 2021
Comment on lines +214 to +239
Some presentations of mixed-effects models, especially those related to *multilevel modeling* [@MLwiNUser:2000] or *hierarchical linear models* [@Rauden:Bryk:2002], leave the impression that one can only define random effects with respect to factors that are nested.
This is the origin of the terms "multilevel", referring to multiple, nested levels of variability, and "hierarchical", also invoking the concept of a hierarchy of levels.
To be fair, both of those references do describe the use of models with random effects associated with non-nested factors, but such models tend to be treated as a special case.

The blurring of mixed-effects models with the concept of multiple, hierarchical levels of variation results in an unwarranted emphasis on "levels" when defining a model and leads to considerable confusion.
It is perfectly legitimate to define models having random effects associated with non-nested factors.
Random effects tend to be defined with respect to nested factors only for two reasons: such cases occur frequently in practice, and some of the computational methods for estimating the parameters in the models can be easily applied only to nested factors.

This is not the case for the methods used in the MixedModels package.
Indeed, nothing special is done for models with random effects for nested factors.
When random effects are associated with multiple factors, exactly the same computational methods are used whether the factors form a nested sequence, are partially crossed, or are completely crossed.

There is, however, one aspect of nested grouping factors that we should emphasize, which is the possibility of a factor that is *implicitly nested* within another factor.
Suppose, for example, that the sample factor was defined as having three levels instead of 30, with the implicit assumption that sample is nested within batch.
It may seem silly to try to distinguish 30 different batches with only three levels of a factor but, unfortunately, data are frequently organized and presented like this, especially in textbooks.
The factor cask in the data is exactly such an implicitly nested factor.
If we cross-tabulate cask and batch we get the impression that the cask and batch factors are crossed, not nested.
If we know that the cask should be considered as nested within the batch then we should create a new categorical variable giving the batch-cask combination, which is exactly what the sample factor is.
A simple way to create such a factor is to use the interaction operator, '`&`', on the factors.
It is advisable, but not necessary, to drop unused levels of the interaction from the set of all possible levels of the factor.
(An "unused level" is a combination that does not occur in the data.)
A convenient code idiom is

In a small data set like this one we can quickly detect a factor being implicitly nested within another factor and take appropriate action.
In a large data set, perhaps hundreds of thousands of test scores for students in thousands of schools from hundreds of school districts, it is not always obvious whether school identifiers are unique across the entire data set or just within a district.
If you are not sure, the safest thing to do is to create the interaction factor, as shown above, so you can be confident that levels of the district:school interaction do indeed correspond to unique schools.
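
The check described in the passage above can be sketched outside of any modeling package. The following is a minimal, language-neutral illustration in plain Python, not the MixedModels idiom the text refers to. The toy `records` and the helper names `reused_labels` and `interaction_labels` are hypothetical. It flags inner labels that recur under more than one outer group, which is the pattern that makes the factors look crossed in a cross-tabulation, and then builds combined `outer:inner` labels from the observed rows only, so no unused combinations are created.

```python
from collections import defaultdict

# Hypothetical toy data: (district, school) pairs in which the school
# labels restart within each district.
records = [
    ("D1", "S1"), ("D1", "S2"),
    ("D2", "S1"), ("D2", "S2"),
]


def reused_labels(rows):
    """Return the inner labels that occur under more than one outer group."""
    groups = defaultdict(set)
    for outer, inner in rows:
        groups[inner].add(outer)
    return sorted(label for label, outers in groups.items() if len(outers) > 1)


def interaction_labels(rows):
    """Build one 'outer:inner' label per row, using observed rows only."""
    return [f"{outer}:{inner}" for outer, inner in rows]


reused = reused_labels(records)          # labels shared across districts
combined = interaction_labels(records)   # one unambiguous label per row
```

Reused labels do not by themselves prove nesting; they only show that the inner identifier cannot be trusted to be unique on its own, which is when building the combined label is the safe choice.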
❤️
palday approved these changes Sep 29, 2021
LGTM: feel free to squash and merge when ready