Part II: Measuring Media-related Educational Competencies

Jennifer Tiede

Measuring the media-related educational competencies of preservice teachers is important in the context of this dissertation because it links the dimensions of modeling and advancing competencies. On the one hand, measurement instruments can validate and operationalize underlying models, which means concretizing and defining what a competency aspect comprises in terms of observable and measurable behavior. On the other hand, measuring competencies is important for evaluating practices of advancement, because it allows for grounded conclusions, e.g., on the success and outcomes of such practices. This evaluative perspective will be stressed below. However, competency measurement is a complex and challenging task. The following part of this dissertation will give an overview of this context, apply the findings to the models analyzed in Part I, and then introduce and discuss an example of a measurement with Paper 1, “Media Pedagogy in U.S. and German Teacher Education,” concluded by further considerations.


Fig.: Assessment triangle (Shavelson 2010, 42), linking Construct (competence), Observation (task and response sample), and Interpretation (scoring, reliability, validity, and utility).
The construct of competence is often described by a theoretical competency model. On this basis, tasks or stimuli are developed to evoke the construct, i.e., to trigger observable performance. This performance first needs to be scored, for example, by a generic rating scale or a specific rubric. To confirm that an inference about the intended construct of competence is appropriate, the results should then be analyzed regarding reliability, validity, and objectivity, the commonly accepted classic criteria for methodological rigor in a measurement: it is considered reliable if the measurement and the results can be reproduced accurately; it is considered valid if it actually measures what is intended and if the theoretical construct matches the empirical phenomenon; and it is considered objective if the results are independent of the person who conducts the testing (Przyborski and Wohlrab-Sahr 2014). This way, conclusions are drawn regarding the presence or absence of the construct and its development or status.
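The criterion of objectivity described above can be illustrated with a small computation. The following is a minimal sketch, not a procedure taken from the sources cited here: it assumes that two raters scored the same responses with the same rubric and uses Cohen's kappa, one common chance-corrected agreement coefficient, as an indicator of how far scores depend on the rater. The function name and the rubric scores are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters who scored the same responses."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of responses")
    n = len(rater_a)
    # Share of responses on which both raters assigned the same score.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, based on each rater's score frequencies.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical rubric scores (0 to 2) given by two raters to eight responses.
scores_rater_1 = [2, 1, 0, 2, 1, 1, 0, 2]
scores_rater_2 = [2, 1, 0, 1, 1, 1, 0, 2]
print(round(cohens_kappa(scores_rater_1, scores_rater_2), 2))  # 0.81
```

A value close to 1 would suggest that the scoring depends little on the individual rater; which threshold counts as sufficient is a matter of convention and is not specified in the sources discussed here.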
It is important to note that this is a simplified explanation of central relationships in accordance with the triangle described by Shavelson (2010), providing a selective and accentuated overview. Performance, for example, does not only depend on the inherent competency but is influenced by other factors as well. According to Koenig and Sesink (2012), there are implicit and explicit demands evoking performance. The output which occurs as a reaction depends on the dispositions within the person, which in turn depend on his or her potential, knowledge, skills, and further influences. The authors describe a blank spot within the transformation of disposition into performance and point out that in this so-called "middle sphere" (p. 299) further aspects come into play, such as tools and conditions of learning.
Despite its simplifying design, the triangle of construct, observation, and interpretation is nevertheless a useful starting point for considering the value of competency measurement in the overall context of a multifaceted analysis of media-related educational competencies. Against this background, the three dimensions can be specified with regard to competency measurement in teacher education to clarify the methodological and conceptual requirements and challenges of this field. The research tradition of teacher professionalization offers a suitable starting point due to its systematic focus on competency measurement.
Part I of this dissertation analyzed the dimension of competence as a construct and emphasized the value of sound and valid competency models. Basing measurement instruments on such models is desirable because, according to Hartig and Klieme (2006), it is necessary to precisely define the competency in question first in order to specify the situations in which intrapersonal and interpersonal differences should occur, and in which way. Competency models offer a systematic answer to this claim. As discussed in Part I, some competency models such as M³K even offer standards which are an appropriate starting point for the definition of relevant situations.
The design of tasks depends on the overall approach and understanding of competencies and testing. In teacher education, there are two main angles from which this issue is approached, namely the perspectives of competency diagnosis and of traditional pedagogical diagnosis. Competency diagnostic approaches focus on assessing multi-faceted and complex performance dispositions. Thus, they are oriented towards complex and demanding situations and also take into account non-cognitive facets such as beliefs or motivational and volitional elements. Traditional pedagogical diagnostic approaches rather focus on clearly defined constructs which serve to describe or predict specific achievements (Schaper 2009).
Due to the complexity of the construct competence and its reference to, and dependence on, real-life situations, most measurement approaches in teacher education research focus on realistic simulations to confront participants with complex and real-life demands (Schaper 2009). In accordance with this, Shavelson (2010) suggests that tasks should "(a) be real-life in nature, (b) tap complex abilities and skills, (c) be amenable to practice and improvement, and (d) be amenable to standard setting" (p. 46).
These tasks can take different shapes and foci, depending on the overall context and purpose, and should be varied in their approach to ensure a multifaceted perspective. For example, there are items requiring objective responses and those that require subjective ones. Their application depends on the purpose; e.g., there are single-choice questionnaires with one objectively correct answer and several distractors, i.e., wrong options to choose from. Such items are commonly used in the context of achievement or knowledge assessment in higher education. In contrast, subjective input is appropriate, for example, in the case of self-assessment surveys. Such surveys build on the honest assessment of participants and do not aim at right or wrong answers, as they investigate human experience, judgment, and feeling (Muckler and Seven 1992). Notably, they are quite common in the context of competency measurement. As the following chapter and the considerations on TPACK will illustrate, this may also be due to the fact that it is quite challenging for researchers to determine unambiguously what is clearly right and wrong in the context of the construct of competency in the way an objective measurement would require. As Schaper (2009) points out, subjective self- or peer assessments are common and necessary in research on teacher education, especially in the context of non-cognitive facets such as social-communicative or perceived self-efficacy; but they are of limited validity due to bias and judgment effects (cf. also Hartig and Jude 2007; Tousignant and DesMarchais 2002).
Further differences between measurements and instruments include the degree of standardization. According to Leutner, Hartig, and Jude (2008), educational assessments can either follow standardized testing procedures, which allow for quantification and comparison, or less standardized ones, which may appear, e.g., as portfolios or biographical surveys. Standardization in this context means that the demands are identical for each individual taking the test in terms of tasks and administration (Shavelson 2010).
Measurement instruments can further be differentiated in terms of their quantitative or qualitative orientation. Quantitative methods are about the "gathering, analysis, interpretation, and presentation of numerical information" (Teddlie and Tashakkori 2009, 5). They proceed deductively and test hypotheses derived from theory. Typical examples of quantitative measurement instruments are single-choice questionnaires, which lend themselves to automated statistical analysis. This makes quantitative tests particularly appropriate for large numbers of participants and for research questions that can be answered by statistical analyses. By contrast, qualitative methods focus on the "gathering, analysis, interpretation, and presentation of narrative information" (ibid., p. 6). Their nature is inductive, i.e., they generate theories based on the generalization of single cases. Typical examples of qualitative measurement instruments are guided interviews, which can be recorded and analyzed, e.g., by a qualitative content analysis (Mayring 2015). Naturally, such a qualitative or thematic analysis of narrative material is more difficult to apply with large numbers of participants. There are also mixed methods approaches that combine quantitative and qualitative methods in an instrument or study (Teddlie and Tashakkori 2009).
The third point of the triangle introduced by Shavelson (2010) refers to the assessment and interpretation of data collected. Against the background of different measurement and item types pointed out above, Schaper (2009) notes that assessment and interpretation are again dependent on the overall testing approach: traditional pedagogical-psychological tests are usually evaluated against social criteria based on reference populations, while for competency measurement it is preferable to apply a criterion-based assessment to be able to interpret the competency levels achieved individually.
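The difference between the two interpretation approaches can be made concrete with a short example. The sketch below rests on assumptions that do not come from Schaper (2009): the norm-referenced reading is represented by a z-score relative to a reference population, the criterion-referenced reading by a comparison with fixed cut scores, and all scores, level names, and cut scores are hypothetical.

```python
from statistics import mean, pstdev


def norm_referenced(score, reference_scores):
    """Position of a score relative to a reference population, as a z-score."""
    spread = pstdev(reference_scores)
    if spread == 0:
        raise ValueError("reference scores must vary")
    return (score - mean(reference_scores)) / spread


def criterion_referenced(score, cut_scores):
    """Highest competency level whose cut score is reached, or None if none is.

    cut_scores maps a level name to the minimum score required for that level.
    """
    reached = [level for level, cut in cut_scores.items() if score >= cut]
    return max(reached, key=cut_scores.get) if reached else None


# Hypothetical reference population and hypothetical level boundaries.
reference = [8, 10, 12, 14, 16]
cuts = {"basic": 5, "intermediate": 11, "advanced": 17}

print(round(norm_referenced(14, reference), 2))  # 0.71, i.e., above the reference mean
print(criterion_referenced(14, cuts))            # intermediate
```

The same raw score thus yields two different kinds of statements: one about a participant's standing within a group, the other about the competency level reached regardless of how others performed.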
Overall, these perspectives illustrate the breadth and possibilities of competency measurement. However, practices of competency measurement have also been criticized repeatedly, especially in the German context with its rich theoretical discourse on competency modeling and measuring. As Schaper (2009) points out, current practices neglect the development of competency level models and of competency development models. Furthermore, integrative strategies of competency modeling which successfully link empirically-inductive and normatively-deductive approaches are scarce. Also, the validation of measurement instruments leaves room for improvement. Moreover, according to Trültzsch-Wijnen (2016), current practices often neglect a consistent differentiation between the measurable performance and Kompetenz as a non-measurable construct.
Beyond such objections from a methodological viewpoint, there are also more general points of criticism related to the measurement of competencies emanating from the conceptual and terminological discourse about Medienkompetenz [media competence] and Medienbildung [media education]. As argued in Chapter 2, Tulodziecki (2010) points out that this discourse stems from the diverging and sometimes overlapping and inconsistent use of the two concepts. The author suggests referring to Medienkompetenz as an objective which describes a level for media pedagogical actions, while Medienbildung is the process in which Medienkompetenz can be advanced. Understanding Medienkompetenz as a target perspective which can be differentiated in levels consequently allows for and even demands competency measurement. However, this approach and the respective quantification and measurability of competencies have also been challenged repeatedly and are not agreed upon in current discourses (Schorb 2009; Hugger 2006; Schaumburg and Hacke 2010; Moser 2010).
Overall, as these examples of potential shapes and foci of measurement instruments show, the conceptual breadth of media-related educational competencies is also reflected in varying forms of operationalization in measurement instruments. Hence, it can be expected that the measurement instruments for the three models introduced in Part I, which have either been developed, are in development, or might be developed in the future, assume different forms and are heterogeneous. Against the background of this brief introduction, selected measurement instruments will be introduced in the following chapter to complement the previous model comparison with another important facet.

7. Competency Measurement Instruments Based on DigCompEdu, TPACK, and the M³K Model

In Part I, DigCompEdu, TPACK, and M³K were introduced as three examples of models of media-related educational competencies from three different contexts, and their design and features were analyzed and compared. Given the relevance of competency measurement as argued above, it is now consistent to complement the model analyses with a comparative perspective on competency measurement with DigCompEdu, TPACK, and M³K. Hence, an overview will be provided for each of the three models, including central measurement instruments which were developed to operationalize them. The selection of instruments included in this chapter is constrained by the limited availability of instruments for DigCompEdu and M³K: as pointed out earlier, both of these models are rather recent, so their impact and related research have been limited so far, which also applies to their respective measurement instruments. By contrast, TPACK has been described as being received and operationalized extensively in multiple contexts. Against this background, the available instruments will be introduced first and then discussed regarding their availability, their design and contents, and their operationalization of the underlying model. This way, further important facets will be added to the analyses of the three models.

7.1 Measurement Instruments for DigCompEdu
In the case of the European DigCompEdu model, the "DigCompEdu Check-In" self-assessment tool is a freely available online tool. Its German version has been empirically confirmed to be reliable and valid (Ghomi and Redecker 2019). There are different versions of the Check-In tool for teachers in primary, secondary, and vocational education and training, for academics teaching in higher or further education, and for lecturers in adult education or continuous professional development. These tools are available in several languages. Their overall goal is to help educators of all kinds reflect on and advance their digital competence. Participants taking this test answer 22 self-assessment items, each with five answer options, and receive feedback on the status of their competencies, as well as suggestions and milestones for further development (EU Science Hub 2019; Ghomi and Redecker 2019).
The tool claims a close relationship to the DigCompEdu model because the competency areas and aspects from the model are used as categories and items within the tool (EU Science Hub 2019). To illustrate the relationship between model and self-assessment tool, Table 1 lists one example of a competency aspect with its proficiency statements as formulated in the model and contrasts it with the respective self-assessment item from the tool.

DigCompEdu model, proficiency statements:
A2: When implementing collaborative activities or projects, I encourage learners to use digital technologies to support their work, e.g. for internet search or to present their results.
B1: I design and implement collaborative activities, in which digital technologies are used by learners for their collaborative knowledge generation, e.g. for sourcing and exchanging information. I require learners to document their collaborative efforts using digital technologies, e.g. digital presentations, videos, blog posts.
B2: I set up collaborative activities in a digital environment, e.g. blogs, wikis, moodle, virtual learning environments. I monitor and guide learners' collaborative interaction in digital environments. I use digital technologies to enable learners to share insights with others and receive peer-feedback, also on individual assignments.
C1: I design and manage diverse collaborative learning activities, where learners use a variety of technologies to collaboratively conduct research, document findings and reflect on their learning, both in physical and in virtual learning environments. I use digital technologies for peer-assessment and as a support for collaborative self-regulation and peer-learning.
C2: I use digital technologies to invent new formats for collaborative learning.

Check-In tool, answer options:
• My students do not work in groups
• It is not possible for me to integrate digital technologies into group work
• I encourage students working in groups to search for information online or to present their results in digital format
• I require students working in teams to use the internet to find information and present their results in a digital format
• My students exchange evidence and jointly create knowledge in a collaborative online space

Juxtaposing the original DigCompEdu competency aspect and its proficiency statements with the item from the Check-In tool and its answer options yields two striking observations. First, the model format was operationalized for the instrument: the abstract competency heading was turned into an activity statement, and the structure of the proficiency scale, which is marked by steps from A1 to C2, is not explicitly mentioned in the tool. Second, the extent was clearly reduced. The DigCompEdu model describes each competency aspect in six proficiency stages, while the DigCompEdu Check-In tool uses five levels, an adaptation that was based on different implementation stages supposedly prevalent among current teachers. Hence, the five levels are based on the following six stages: "no use - basic use - diversification - meaningful use - systematic use - innovation" (Ghomi and Redecker 2019, 453).
The example in Table 1 illustrates the discrepancy between model and tool clearly. On the higher proficiency levels, the competencies described in the original model are quite complex and refer to the educator's ability to design and manage complex learning activities and learning environments and to create and innovate. The verbs in the highest two proficiency levels include "design," "manage," "use," and "invent." In the Check-In tool, on the contrary, collaborative learning is understood in a narrower sense and the items refer to less complex actions.
It is remarkable in this context that not all of the Check-In tool items directly relate to the educator's competencies: the highest competency level is described as "My students exchange evidence and jointly create knowledge in a collaborative online space," which emphasizes the effect of the educator's competency on the learners instead of the competency itself. This effect, namely the learners' reactions and resulting actions, is obviously subject to numerous further influences beyond the educator's competencies. To have students exchange evidence and create knowledge together in a collaborative online space, it is also necessary, for example, to have the appropriate technical equipment at hand, to work with a motivated and productive group of learners, or to have the curricular framework for integrating respective tasks. In other words, an educator who would generally be capable of demonstrating the competency in question might not select this option in the Check-In tool, not because of a lack in his or her competencies but because other framework conditions prevent his or her students from exchanging evidence and creating knowledge together in a collaborative online space. Against this background, it appears questionable whether a focus on the effects of competencies, as suggested by the Check-In tool in some places, is an appropriate means of achieving a realistic self-assessment of educators' competencies. This problem is exacerbated by the changing format of the answer options: option one, referring to the lowest level of competency, starts with "my students do not…," while the third option inconsistently begins with "I encourage…." This overall inconsistency adds to the impression that the tool tends to both simplify and concretize matters even at the expense of conceptual precision.

7.2 Measurement Instruments for TPACK
In the course of the extensive and varied scientific reception of TPACK, many instruments were developed. The first instrument that received greater recognition and was used as a basis for most of the succeeding instruments was published by Schmidt et al. (2009) as a self-assessment scale. Further forms of TPACK-based measurement instruments include open-ended questionnaires (So and Kim 2009), performance assessments (Graham, Tripp, and Wentworth 2009), interviews (Ozgun-Koca 2009), or observations (Suharwoto 2006; Koehler et al. 2014). Only recently have researchers begun to measure knowledge in subareas of TPACK by objective knowledge tests as well (Drummond and Sweeney 2017). Apart from that, numerous developments of TPACK and adaptations to fit specific contexts brought along their own models and, consequently, measurement instruments. For example, Saengbanchong, Wiratchai, and Bowarnkitiwong (2014) suggest TPACK-S (Technological Pedagogical Content Knowledge appropriate for instructing students), thus adding the perspective of the students to the TPACK model. Bachy (2014) amends the classical model to inform TPDK (techno-pedagogical disciplinary knowledge), an interplay of pedagogical knowledge, technological knowledge, discipline (PCK) and personal epistemology. Sang et al. (2016) introduce CTPACK (Chinese preservice teachers' technological pedagogical content knowledge), which specifies the classic TPACK model for a certain cultural background. Benton-Borghi (2015) develops UDL infused TPACK (Universally Designed for Learning Infused Technological Pedagogical Content Knowledge), merging the idea of TPACK with a second educational framework.
As this brief overview demonstrates, the breadth and heterogeneity of approaches to measuring TPACK make it difficult to abstract a core procedure or common approach. To provide a selective overview of different approaches, Table 2 presents a selection of items used in different self-assessment instruments to measure a comparable aspect within the same dimension of TPACK. The central dimension of Technological Pedagogical Content Knowledge serves as an example for illustrating ways in which different self-assessment surveys aim to capture the same knowledge. For reasons of comparability, all surveys included in Table 2 are self-assessment instruments.
This overview of examples shows a certain variety between different approaches even with regard to a comparable aspect. The scales range from 5-point to 7-point, the perspectives include teachers' perspectives on their own TPACK and students' perspectives on their teacher's TPACK, and the items set different emphases. While all of them incorporate the idea of a combined knowledge of technology, pedagogy, and content, their wording differs considerably. An important observation in this context is that, contrary to the idea of TPACK as a model of knowledge domains, all five examples aim at a competency instead of a knowledge domain; related verbs are "can teach," "represent content […] via the use," "ability to integrate," "can combine," "evaluate," and "can use." In Önal's (2016) instrument, the scale even ranges from "incompetent" to "completely competent" and thus verbalizes the competency orientation of these measurement instruments. On the one hand, this observation supports the previously expressed assumption that TPACK is an appropriate reference for analyzing models of media-related educational competencies despite its focus on knowledge. On the other hand, the blurring of knowledge and competence as represented by these surveys adds to the conceptual imprecision with which central sources have been shown to operate.

With regard to the differences between the five examples of approaches, some of them might be explained by the fact that TPACK, as a structural competency model, is quite abstract. The descriptions of each knowledge area do not indicate specific competencies in the way DigCompEdu or M³K do by their standards. Hence, the development of items for measuring these knowledge fields is very open, as the precise scope of each knowledge area needs to be determined first and leaves considerable room for interpretation. This condition has been recognized as problematic, as summarized by Cox and Graham (2009): "Thus far, the explanations of technological pedagogical content knowledge and its associated constructs that have been provided are not clear enough for researchers to agree on what is and is not an example of each construct. Mishra and Koehler and others have provided definitions of TCK, TPK, and TPACK that articulate to some degree the centers of these constructs, however, the boundaries between them are still quite fuzzy, thus making it difficult to categorize borderline cases." (p. 60) Consequently, it has been pointed out repeatedly that the psychometric features of the model and the nature of pedagogical knowledge remain challenging and problematic (Chai et al. 2011; Archambault and Barnett 2010; Valtonen et al. 2015).

7.3 Measurement Instruments for M³K

In the case of M³K, no finalized and validated instrument has been published so far, but there is a measurement instrument which was developed in the course of the project and used in several pilot studies despite its non-finalized validation (Herzig et al. 2016; Tiede and Grafe 2016; cf. Chapter 8). Since this instrument is the only measurement operationalization of the M³K model available, it will be included in the following.
The M³K model has been described as being designed also as a contribution to the systematic improvement of teacher education programs (Grafe and Breiter 2014). Consequently, the M³K measurement instrument is a standardized and quantitative instrument with a focus on items that are intended to measure competencies objectively. This design allows for the collection of easily comparable data in large numbers and, if applied in a representative sample, is theoretically suitable for drawing generalizable conclusions on the target group of German preservice teachers. Hence, the measurement can help inform policy recommendations and, eventually, foster respective improvements.
In accordance with the structure of the M³K model of medienpädagogische Kompetenz, there are three main fields in the survey referring to "teaching with media," "teaching about media," and "media-related school reform." Furthermore, "technological knowledge" is added in the sense of a correlate to account for the important role this factor plays for the development of medienpädagogische Kompetenz (Herzig and Martin 2018; cf. Chapter 4.3). Also, there are objective items to determine the participants' competencies and additional self-assessments for media-related beliefs for each field (cf. Chapter 8.1.1). Table 3 illustrates how competency aspects have been operationalized to items for the measurement instrument. It shows two examples of M³K competency aspects and the corresponding item from the measurement instrument. The competency standards from the model are also included because these standards fulfill the conditions for developing empirical measurements as defined by Hartig and Klieme (2006): they precisely specify the competencies in question, they indicate relevant situations and define competent acting.

Item: Many students have a television set in their rooms and often watch early-evening series, films and other television entertainment programs.
Do out-of-school television viewing habits influence the way in which students learn with video films in the classroom?
Out-of-school television viewing habits…
• influence learning with videos because students come to see the videos as easy media, which causes the effectiveness of their learning to suffer.
• influence learning with videos because students are familiar with movies and can therefore learn better with videos than with written texts.
• do not influence learning with videos because classroom learning with videos requires skills other than those used when watching TV for entertainment purposes.
• do not influence learning with videos because students are aware that classroom videos are for learning purposes while home TV is mainly about entertainment.

Area: Medienerziehung/teaching about media
Aspect: Understanding and Assessing Conditions for Media Education Activities
Standard: The preservice teachers are able to describe the relevance of use of media outside of school for socialization, education and learning with the aid of examples by reference to theoretical approaches and empirical results.
Item: In media effects research, there are numerous academic studies in the area of "media and violence." Many findings have signified that the way in which violent content is presented has an influence on whether the consumption of that content promotes aggressive behavior.
Which statement about the effect of media violence on children is the most accurate?
Consuming violent media content is more likely to trigger aggressive behavior in children…
• if the violent main character is punished for his/her behavior.
• if the act of violence is carried out by a main character with a high degree of identification potential.
• if negative effects for the victim of violence are explicitly presented.
• if the violence is presented as unjustified.

Tab. 3.: Comparison of selected M³K competencies and survey items.
These examples show how situations were formulated for the test items in order to operationalize the competencies described in the M³K model and specified by the standards. The first example relates to competency A1.1 from the field of "teaching with media," aspect "Understanding and Assessing Conditions for Media Education Activities," which is about the relevance of extra-school media use for learning contexts. The corresponding item realizes this requirement through a specific scenario of private TV consumption and its effect on learning. Hence, it can be concluded, as in the case of DigCompEdu (cf. Chapter 7.1), that the item further concretizes the competency as defined by the standard. The same applies to the second example in Table 3 where the relevance of extra-school media use for socialization, education and learning is specified by a scenario on the effects of media violence on children.
The two typical items from the measurement instrument both start with a short scenario and thus employ the methodology of situational judgment. For this method, standardized situations or hypothetical scenarios are presented that require participants to analyze the situation, to develop appropriate behavior for solving the problem, and to apply their knowledge appropriately, depending on the situation. The competency in question is then inferred from the hypothetical actions participants choose (Seifert and Schaper 2012; Weekley and Ployhart 2006). This characteristic in the item construction points to a central issue in the objective measurement of competencies: as argued in Chapter 2, a competency comprises more than just knowledge, and it is therefore questionable to attempt to measure a competency by a survey which ultimately requires knowledge to answer. The multiple-choice format makes the survey easy to scale up and enhances objectivity and comparability. However, at the same time it inevitably neglects important facets contributing to competence beyond knowledge, such as the situational reference and contextualization of competencies (Sampson and Fytros 2008), e.g., in terms of motivational, volitional, and social willingness within a given situation (Weinert 2001). Against this background, the situational judgment format was applied to encourage participants to come to decisions with a certain reference to real-life situations and to activate a skillset and abilities beyond mere knowledge. Yet the format of objective measurement and standardized multiple-choice items remains challenging in the light of these characteristics of the construct of competency. The validation of the M³K measurement, which could not yet be completed with satisfactory values, adds to the impression of measurement challenges in this specific case, although multiple reasons are potentially responsible in this context. Scarce learning opportunities in relevant fields for preservice teachers in the sample, for example, are assumed to have contributed to a weak internal consistency (Herzig et al. 2016). Eventually, the ongoing measurement instrument validation could be enriched by a triangulation of methods, e.g., by applying qualitative approaches with smaller samples to observe respective behavior in realistic scenarios directly and thus address the methodological challenges in assessing complex competencies with a standardized multiple-choice test.

Conclusions from the Analysis of DigCompEdu, TPACK, and M³K Measurement Instruments
With regard to the proximity of model and tool, the structure of DigCompEdu as a competency level model provides a straightforward starting point for the development of a measurement instrument because of its precise proficiency descriptors for self-assessment. Hence, the development of items is less ambiguous and open than in the cases of TPACK and M³K, because the basis for self-assessment items is already provided. Instead, challenges in developing a DigCompEdu instrument include aspects like validation, i.e., the fit of statements and model will have to be confirmed for all language versions, as has already been done for the German instrument (Ghomi and Redecker 2019), and feasibility, i.e., the instrument will have to be relevant for a considerably wide target group and applicable in terms of extent. Like M³K, DigCompEdu qualifies as a basis for a standardized quantitative instrument, which could help shape policy recommendations and have an impact on a European level. With its proficiency descriptors, it also suggests a self-assessment instrument to fulfill the purpose of fostering educators' individual professional development.
If the varieties and options for measurement instruments of DigCompEdu, TPACK, and M³K are contextualized and contrasted, TPACK stands out due to the considerable breadth and variety of instruments, which correspond to the popularity of this model. The relationship between model and measurement is noteworthy in this case. It was described in Part I that TPACK is a structural competency model, and for structural competency models the related literature suggests measurements which explore and validate the nature and relationship of their dimensions (Hartig and Klieme 2006). This claim was met repeatedly (Archambault and Barnett 2010; Schmidt et al. 2009; Shinas et al. 2013). However, TPACK measurements have also been used for a number of further purposes, such as competency assessment and objective measurement, which leads to the conclusion that the shape, purpose and design of competency measurement instruments depend on a number of factors which may include, but are not limited to, the type and purpose of the respective competency model. The case of TPACK illustrates that a structural competency model can also serve, for example, as a basis for standardized measurement instruments which seek to objectively quantify and ultimately compare participants' competencies (Drummond and Sweeney 2017). However, based on the preceding analyses, on the comparison with competency level models and also on the criticism expressed repeatedly towards TPACK measurements (Brantley-Dias and Ertmer 2013; Cavanagh and Koehler 2013; DeSantis 2016; Graham 2011), it becomes clear that such measurement approaches face specific challenges and require additional considerations in terms of validation and item-model fit to make sure that the gap between a structural model and a standardized assessment is bridged appropriately.
Out of the three measurement instruments introduced, the M³K measurement is the only instrument assessing competencies with an objective test with right and wrong answers. The approach aims at reliable data that are less influenced by factors such as subjectivity, unrealistic assessments of one's own abilities, or social desirability. Yet the issues in validation mentioned above substantiate the conclusion that each approach and method to measuring media-related educational competencies brings about specific advantages and challenges that need to be balanced.
Overall, the overview of measurement instruments of DigCompEdu, TPACK and M³K supports the conclusions drawn in Part I as regards significant differences between the three, which help fulfill the requirements of their unique contexts. With respect to measurement instruments, too, it is essential to consider the context and objective of the measurement application. For example, standardized and knowledge tests facilitate an objective comparison of participants and are easier to upscale for larger pilots, while measurements like systematic observations or interviews offer rich insight into the development and knowledge of individuals but are often less applicable for large-scale assessments. Furthermore, evaluating and assessing preservice teachers' competencies may have different implications and requirements than the corresponding considerations in the context of inservice teachers: the two target groups differ in prior knowledge, professional experience and requirements in terms of competencies and knowledge. As argued above, a competency measurement with German preservice teachers will necessarily have to focus on academic and theory-based knowledge instead of practical skills and competencies, while respective measurements with US preservice teachers might reveal a greater practical focus, given systematic differences in systems of teacher education. Hence, the suitability of models and instruments also varies, depending on factors such as cultural background or main purpose of the model implementation. Against this background, it seems beneficial that there are specific competency models and measurement instruments for various purposes and contexts. It has also become clear that well-grounded international models like DigCompEdu, building on various frameworks and summarizing a broad background, are valuable and applicable in many contexts but are not universally applicable or a replacement for the national and smaller-scale models they build on.
The analysis of the connection between the theory-focused first phase of German teacher education and respective measurements putting less emphasis on practical skills of application offers an example of a context in which the applicability of an otherwise widely usable framework and measurement like DigCompEdu is limited.
This chapter provided an overview of different ways of measuring competencies and emphasized the importance of considering the overall background and the underlying competency model. The logical next step is to complement these theory-based conclusions with an analysis of the practical application of such a competency measurement in an international comparative context. Hence, Paper 1 will be introduced in the following chapter describing a quantitative measurement of media-related educational competencies. The results of a main pilot in Germany will be compared with the results of a smaller exploratory study with preservice teachers in the USA, both of which were conducted in the context of the M³K research project. Accordingly, the M³K model and measurement instrument will be used as a basis for this measurement and comparison in the following.

Measurement Instrument
The comparative study of media-related educational competencies of preservice teachers in Germany and the USA introduced in the following is based on the German M³K measurement instrument developed in the course of the M³K project (Herzig and Martin 2018; cf. Chapter 7.3). This development process included an iterative design cycle with two major pilots and several smaller interim pilots with German preservice teachers and continuous processes of revision and improvement. There were three versions of test books with rotating contents to account for the length of the instrument. As mentioned above, the validation of the resulting instrument used in the following was not finalized during the project runtime. These German main pilots were supplemented by an exploratory study with US preservice teachers to consider and explore the international applicability and connectivity of the instrument. Table 4 lists the items included in the original German instrument in its final version and the items used in the US exploratory study. As Table 4 reveals, all items were used both in the German main pilot and in the US exploratory study except for the items in relation to the field of media-related school reform, i.e., 10 objective items, 6 self-assessment items on beliefs and 6 self-assessment items on self-efficacy. These items were omitted in the US exploratory study for reasons of extent and cultural fit. In terms of extent, the smaller sample (n USA = 109; n GER = 914) rendered a rotation of items in different test books impossible. As argued in Paper 1, the smaller size of this sample is due to the exploratory character of the comparative study, which serves to supplement the main study in Germany and add a further perspective to it. With regard to the cultural fit, the previous chapters contributed from different angles to the conclusion that the national and cultural background is of fundamental importance for competency measurements.
While this fact necessitates a careful translation of the whole M³K instrument, it is critical especially in the context of media-related school reform, because this field is particularly dependent on the educational system and the role of educators in this system. Since it was considered insufficient to compensate for the significant differences between Germany and the USA in this context by minor semantic changes or adaptations, it was decided to leave out this field in favor of a targeted exploration of the other two fields of teaching with media and teaching about media.

Survey Translation Methodology
For the application of the German M³K instrument in the USA, an extensive adaptation process was necessary to ensure that US American participants shared the same conditions as the German participants when taking part in the study. As argued in Chapter 3.2.1, it is generally considered challenging to ensure test validity in international comparisons. Most centrally, two elements have to be considered when adapting an instrument for cross-national application, namely language and cultural fit. To respond to these claims, a five-step team translation approach was developed which mainly builds upon the Guidelines for Best Practice in Cross-Cultural Surveys (Survey Research Center 2010) and Harkness (2008).

First step: Translation
In the beginning, three independent translations were drafted. Two of these were prepared by two independent teams of translators fulfilling the following characteristics: (1) professional and experienced translators, (2) US American native speakers, and (3) familiar with the field of educational research. These two teams each worked in two steps: first, one translator produced a translation, and second, this translation was reviewed, critically evaluated and, if necessary, improved by the second translator. This way, two different professional and peer-reviewed translations were drafted.
A third translation was produced by the author of this dissertation as a member of the M³K project team fulfilling the following criteria: (1) experienced with German-English translations, (2) German native speaker, and (3) content expert because involved in the process of test instrument design.

Second step: Review
The first review process was conducted by the author of this work and a second member of the M³K team, who fulfilled the role of survey experts as they were experienced with translations and acquainted with the project including questionnaire, survey and background. In this review process, the two professional translations were compared thoroughly and combined in order to find the best translation. It was observed that one translation tended to be more colloquial while the other one was more formal. This difference turned out to be helpful in the process as there was a certain variety to build upon. Eventually, the more formal version was preferred in a majority of cases.
The professional translators were involved in this step and contacted if the comparison raised questions on translation matters. Additionally, the advance translation created by one of the survey experts in the first step was consulted in case of doubt. There were a few cases when both professional translations seemed inappropriate and the advance translation offered a new idea. As a result of this review, a preliminary translation was developed which served as a basis for the following stages of evaluation and improvement.

Third step: Adjudication I
In this step, decisions were made on issues which had been identified as controversial before. After the review process, which had focused on linguistic and translation matters, the first adjudication served to raise questions about content and cultural fit. An external expert was consulted for this purpose. Being of German origin, having lived in the US for several years and working as a media education professor in the US, she added a valuable point of view and helped improve the review version of the test instrument.

Fourth step: Pretestings
The translated version of the test instrument which resulted from the preceding steps was now tested in three pretestings in order to ensure its cognitive validity. At first, an elaborate cognitive pretesting was conducted with a US expert in media education. Karabenick et al. (2007) suggest such a cognitive pretesting as a means of adapting an established instrument to a new purpose, population, setting or language; this way, it is possible to identify how new populations could interpret items differently, which is helpful for informing efficient adaptations.
The cognitive pretesting was structured with the help of the following four prompts for each item: (1) Please read this question out loud. (2) What is this question trying to find out from you? (3) Which answer would you choose as the right answer for you? (4) Can you explain to me why you chose that answer?
This procedure was intended to identify problems at all three critical cognitive information-processing steps that respondents have to complete successfully: interpreting the item meaning, recalling memories that are relevant for this item, and choosing an answer which correctly reflects these memories (Karabenick et al. 2007). In the process, it was discovered that the majority of the items required further improvements, primarily on a semantic level. Based on the expert's comprehensive feedback and suggestions, a number of changes were accepted.
The current version of the questionnaire was now transformed into an online survey and filled in by a first exploratory pretesting sample of n = 2 US preservice teachers in order to support the validity of the latest version and to rule out potential remaining mistakes or problems. The participants' feedback did not indicate a need for further changes.

Fifth step: Adjudication II
Finally, the translation was discussed and reviewed by the internal team of survey experts once more. Changes that had been made were reconsidered, and the adapted version was accepted as appropriate for the upcoming explorative international survey. At that point, it was not perceived necessary to involve further experts, as the first small test survey had not indicated a need for further editing.
Overall, the translation and adaptation procedure was accepted as appropriate for the given purpose. Disadvantages of the approach include high costs in terms of human and financial resources because of the involvement of professional translators, several staff members, experts and pretesting participants. However, the resulting translation can be considered reliable and valid. Thus, from a methodological viewpoint it is to be preferred to less complex translation approaches such as team translations with fewer experts or non-professional translators, relying on one single translation draft, or back translations (Harkness 2008).
The US version of the questionnaire was administered in two phases in 2015 with n = 109 participants: first, n = 70 preservice teachers of Wheelock College, Boston, filled in a pen and paper version, and in a second step, n = 39 preservice teachers from five US institutions of initial teacher education (University of Chicago, James Madison University in Harrisonburg, Rhode Island College, Ohio University and Appalachian State University) completed the online version. The sample includes preservice teachers at undergraduate and graduate levels and from all kinds of school forms (i.e., primary and secondary). The results of the survey and its comparison to the German data collected in the course of the M³K project will be introduced in Paper 1.

Introduction
The relevance of pedagogical media competencies in teacher education
Given the omnipresence of media like TV, internet and mobile phones and their wide influence on the daily lives of young people (MPFS 2014; Lenhart 2015; EU Kids Online 2014), the relevance of these so-called "new media" for school and teaching has developed and increased over the last decades as well. On the one hand, they can be utilized as an appropriate means to support successful learning processes and to facilitate effective teaching; on the other hand, they have become a subject themselves since students need to learn about media education issues, like responsible behavior in online environments or ethical aspects of internet use, at school (KMK 2012; ISTE 2008). Hence, scholars and practitioners all over the world agree that teachers need specific knowledge and skills in order to integrate new media into their lessons successfully. While most works of research have focused on teachers' and preservice teachers' own media literacy skills or technological knowledge (Fry and Seely 2011; Oh and French 2004), further competencies are required for a professional inclusion of media into school. Teaching with media and teaching about media / media education are generally considered the two core areas in this context. However, there are varying concepts of the specific competencies and skills, which will be summarized under the term "pedagogical media competencies" here. A well-known and established framework for defining these competencies in question was developed in the USA by Mishra and Koehler (2006) as TPACK (Technological Pedagogical Content Knowledge), which is based on Shulman's work (1986). Shulman defined pedagogical content knowledge, content knowledge, and pedagogical knowledge as the core areas of competencies that teachers should be skilled in.
Mishra and Koehler (2006) added the aspects of technological knowledge, technological content knowledge, technological pedagogical knowledge and technological pedagogical content knowledge and thus developed a comprehensive model of the skills needed to teach with media successfully.
Despite the existence of frameworks like TPACK, there is no consensus about the precise shape of pedagogical media competencies, either worldwide or even within countries. Furthermore, their integration into university teacher education is also subject to discourse and has not been realized consistently, even though teacher training has been acknowledged to be a suitable and mandatory place for the acquisition of media pedagogical skills (Blömeke 2003). Hence, there are no binding curricula yet which could ensure a basic media pedagogical education for every preservice teacher, but there are non-binding standards and guidelines that make suggestions for such processes, for example the UNESCO Media and Information Literacy Curriculum for Teachers (Wilson, Grizzle, Tuazon, Akyempong, and Cheung 2011).
This inhomogeneous situation, where efforts and ways to integrate media pedagogy into teacher education can be assumed to vary between countries and institutions, forms the background of this paper. This exploratory study aims to further explore the pedagogical media competencies of preservice teachers in Germany and the USA. Comparing two countries serves to overcome cultural boundaries, to countervail the danger of a narrowed perspective and to benefit from the background, research and knowledge of different viewpoints. Both countries share a rich culture of pedagogical discourse and research on teacher education, which provides a common background to build upon (Grafe 2011). Both countries share generally similar approaches to educational policy and structure, as strong state and local control of education is paired with high levels of federal influence on educational issues (Blömeke and Paine 2008; Tiede, Grafe, and Hobbs 2015). In the following, different models of pedagogical media competencies from both countries will be introduced and the extent to which these competencies have become part of teacher education programs and related studies will be summarized. Afterwards, methods and selected results of a study will be described where the skills in question were measured with students from both countries, based on a comprehensive model of pedagogical media competencies that connects German and international research in this field. The international comparative perspective will help broaden the viewpoint and understand similarities and differences. These data serve to identify different ways of integrating media pedagogy into teacher training and point to conclusions about the consequences these processes entail for preservice teachers and their pedagogical media competencies.

Pedagogical media competencies in German and U.S. teacher education
The issue of teacher competencies is a key factor in advancing the future of education both in the United States and in Germany (for a detailed overview of the development and current state of media education in both countries see, for example, Tulodziecki and Grafe 2012; Hobbs 2010; Tiede, Grafe, and Hobbs 2015).
The Standing Conference of the Ministers of Education and Cultural Affairs of the Länder in the Federal Republic of Germany has recognized the need to include pedagogical media competencies in teacher training, as its declaration on media education at school reveals (KMK 2012). Accordingly, there have been various attempts at such an integration over the last decades (Bentlage and Hamm 2001; Imort and Niesyto 2014). Nonetheless, there are no binding national obligations for institutions of teacher education as, due to the federal system in Germany, the responsibility for higher education institutions lies entirely with the individual federal states. Recently, several federal states have published new educational policy guidelines and recommendations for media literacy (for example in Bavaria: Stmbw 2016). As a result of these efforts, most German preservice teachers can but do not have to engage with media pedagogy in the course of their education. About 17% of all eligible German institutions of teacher education offer M.A. studies with an explicit focus on media pedagogy. The preservice teachers at these institutions can complete such studies in addition to their regular M.Ed. degree. With regard to contents, the focus of these media pedagogical studies varies. The field of teaching with media is addressed explicitly by most study programs (92%), followed by media-related school reform (33%) and media education (25%) (Tiede, Grafe, and Hobbs 2015).
In the USA, the 2016 National Education Technology Plan recently issued by the U.S. Department of Education reinforced the call for a media pedagogical education of all preservice teachers, which is still not obligatory, and emphasized the responsibility of the institutions involved (pp. 32-33). This plan also refers to the ISTE standards for teachers, issued by the International Society for Technology in Education, as a background. These standards describe a framework for the skills teachers should have regarding the educational use of media; they primarily address the field of teaching with media but also include media educational issues and professional development (ISTE 2008). Another important U.S. framework, the Core Principles of Media Literacy Education, was developed by the National Association for Media Literacy Education. These principles mainly focus on media educational aspects (NAMLE 2007). Like the ISTE standards, the NAMLE principles are not mandatory.
U.S. preservice teachers generally have few elective courses; hence, there is a larger number of mandatory courses with media pedagogical contents. Additionally, 52% of all eligible U.S. institutions of teacher education offer master's programs with an explicit focus on media pedagogy. These focus on teaching with media (76%), media-related school reform (23%) and media education (2%) (Tiede, Grafe, and Hobbs 2015). Unlike in Germany, preservice teachers can opt for such master's studies as part of their initial teacher certification, depending on the individual regulations of each state.
As these observations from Germany and the USA indicate, the circumstances of the two countries are comparable to some extent. Both of them generally support and promote the integration of media pedagogy into teacher training and yet lack corresponding binding national obligations. Consequently, preservice teachers in both countries can but usually do not have to study media pedagogical topics in the course of their education. Media pedagogy is included in teacher training as elective courses within the basic education, as additional courses and certificates, or as specific graduate studies (Tiede, Grafe, and Hobbs 2015).
Obviously, there are also differences between the two countries from a systemic point of view. To substantiate this observation, first results of a study will be presented in the following which sought to measure the pedagogical media competencies of preservice teachers from Germany and the USA. The development of a test instrument will be outlined with particular regard to the special requirements of cross-national research. Then, initial data will be introduced and analyzed.

Material and Methods
The M³K model of pedagogical media competencies
A recent approach to defining pedagogical media competencies was made in the course of the German research project "M³K - Modeling and Measuring Pedagogical Media Competencies", funded by the Federal Ministry of Education and Research. This M³K model of pedagogical media competencies serves as a basis for the following study. As a starting point for its development, a broad range of primarily German, but also international literature was reviewed, particularly the works of Tulodziecki and Blömeke (1997; see also Blömeke 2000) and their follow-ups (Siller 2007; Gysbers 2008). A first model was deductively derived from this theoretical basis, structured in dimensions and facets of competencies. In order to assess this structure and to further differentiate the facets, media pedagogical requirements for preservice teachers were surveyed empirically and inductively by means of qualitative semi-structured interviews with national and international subject matter experts (n = 14) based on the critical incident method (Flanagan 1954; Schaper 2009). All interviews were recorded and transcribed. Based on qualitative methods of content analysis (Mayring 2000), the relevant aspects of pedagogical media competencies were extracted and paraphrased. The next step linked the elements identified in the paraphrased texts to the competency dimensions previously derived deductively from the literature (Herzig, Martin, Schaper, and Ossenschmidt 2015).

[Table 5, only partially recoverable. Column headers: Teaching with Media (MD); Teaching about Media (ME). Row label: Aspects of competencies, including "Understanding and assessing conditions" and "Describing and evaluating theoretical approaches".]

The model which was created this way defines pedagogical media competencies as an interplay of three main areas. The first one is media didactics, which means teaching with media or the design and use of media content for educational purposes. The second area is media education and addresses media-related educational and teaching tasks, such as ensuring the students' responsible behavior in online environments or teaching about ethical aspects of internet use. The third field is media-related school development; this refers to professional development and integrating media on a systemic level (Tulodziecki, Herzig, and Grafe 2010; Herzig, Martin, Schaper, and Ossenschmidt 2015; Tiede, Grafe, and Hobbs 2015).
The M³K model is designed as a matrix with the three main areas (media didactics, media education and school reform) on the first axis. Five competency aspects form the second axis. These competency aspects are (a) understanding and assessing conditions, (b) describing and evaluating theoretical approaches, (c) analyzing and evaluating examples, (d) developing one's own theory-based suggestions, and (e) implementing and evaluating theory-based examples. Each field between the two axes is filled with two standards, as Table 5 demonstrates.
The field between "Media Education" and "Describing and evaluating theoretical approaches" for example contains the following two standards: "Standard ME2.1: Student teachers are able to describe concepts of media education and related empirical findings appropriately" and "Standard ME2.2: Student teachers are able to assess concepts from an empirical, normative, or practical perspective" (Tiede, Grafe, and Hobbs 2015).
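The matrix structure described above can be illustrated with a short sketch. It is only an illustration under stated assumptions: the area codes MD and ME and the ID pattern (as in ME2.1) are taken from the text, whereas the code "SR" for school reform and all function and variable names are hypothetical and not part of the M³K project materials.

```python
from itertools import product

# Area codes MD and ME appear in the source text; "SR" for school reform
# is a hypothetical label introduced only for this sketch.
AREAS = {
    "MD": "Media Didactics",
    "ME": "Media Education",
    "SR": "Media-related School Reform",
}

# The five competency aspects (a) to (e), in the order given in the text.
ASPECTS = [
    "Understanding and assessing conditions",
    "Describing and evaluating theoretical approaches",
    "Analyzing and evaluating examples",
    "Developing one's own theory-based suggestions",
    "Implementing and evaluating theory-based examples",
]

STANDARDS_PER_FIELD = 2  # each field of the matrix holds two standards


def build_standard_ids():
    """Map each standard ID (e.g. 'ME2.1') to its (area, aspect) field."""
    ids = {}
    fields = product(AREAS.items(), enumerate(ASPECTS, start=1))
    for (code, area), (aspect_no, aspect) in fields:
        for standard_no in range(1, STANDARDS_PER_FIELD + 1):
            ids[f"{code}{aspect_no}.{standard_no}"] = (area, aspect)
    return ids


standards = build_standard_ids()
print(len(standards))      # 3 areas x 5 aspects x 2 standards = 30
print(standards["ME2.1"])
```

Under these assumptions the matrix yields 30 standards in total, and the ID ME2.1 resolves to the field between "Media Education" and "Describing and evaluating theoretical approaches", in line with the example quoted above.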

Developing a measuring instrument of pedagogical media competencies
Following the development of the model, a test instrument was designed to measure the competencies as defined before. The first items were developed based on theory and on findings from the expert interviews (n = 14) as operationalizations of the model facets and then tested for performance criteria (Herzig, Martin, Schaper, and Ossenschmidt 2015).
Further factors are understood to influence a successful educational use of media even if they are not defined as immediate constituents. This is true primarily for beliefs with regard to teaching with media, teaching about media and school development, perceived media-related self-efficacy, and technological media knowledge (Blömeke 2005; Grafe and Breiter 2014). Test instruments were developed for these factors, too.
For the validation of the instruments, data were collected from students in teacher training programs at 11 different German universities. There were three major surveys with n₁ = 591, n₂ = 434 and n₃ = 919 participants; after the first and second survey, the results were analyzed in detail and the instrument was revised thoroughly. Additionally, extensive pretestings, expert interviews and minor studies helped improve and validate the items.
The final version contains 16 items on media didactics / teaching with media, 14 items on media education, 10 items on school reform and 26 items on technological knowledge. These items are supplemented by 6 items on beliefs for each of the three main areas, 6 items for each of the three main areas that assess perceived self-efficacy, and some demographic data.
The validation of these items is still work in progress, and further work on the test instrument will be required to achieve fully robust results. According to the reliabilities determined in the final survey, 11 out of the 16 items on media didactics are suitable for further improvement and should be retained (α = .56), and the same is true for 12 out of 14 media education items (α = .60), 8 out of 10 school reform items (α = .46) and 19 out of 26 items on technological knowledge (α = .81). The reliability of the beliefs items was α = .64, and that of the self-efficacy items was α = .87.
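To make the reported coefficients more tangible, the following minimal sketch shows how an internal-consistency coefficient of this kind is commonly computed. It assumes that the α values above are Cronbach's alpha, which the text does not state explicitly; the function name and the toy response data are invented for illustration and are not M³K data.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of rows (one row of item scores per participant)."""
    k = len(responses[0])  # number of items
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Sample variance of each item across participants.
    item_vars = [variance([row[i] for row in responses]) for i in range(k)]
    # Sample variance of the participants' total scores.
    total_var = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Toy data (hypothetical): 1 = correct answer, 0 = wrong answer.
consistent = [[1, 1, 1], [0, 0, 0], [1, 1, 1], [0, 0, 0]]  # items agree perfectly
unrelated = [[1, 1], [1, 0], [0, 1], [0, 0]]               # items are uncorrelated

print(round(cronbach_alpha(consistent), 3))  # 1.0
print(round(cronbach_alpha(unrelated), 3))   # 0.0
```

The two toy cases mark the ends of the usual range: items that always agree yield a coefficient of 1, uncorrelated items yield 0. Values such as .46 or .56 thus indicate that the retained items of a scale covary only moderately, which is consistent with the statement that the validation is still in progress.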

Adaptation of the German M³K questionnaire to a US-American version
In order to use the M³K test instrument in an international context, a complex adaptation process was necessary. As international sources were included in the process of developing the model and the instrument, international connectivity was generally given; still, a number of steps had to be taken to guarantee comparable results. Their main goal was to ensure the same conditions for students of both countries. Therefore, a five-step approach was applied which mainly builds upon the Guidelines for Best Practice in Cross-Cultural Surveys (Survey Research Center 2011) and on Harkness and Schoua-Glusberg (1998): 1) Translation: two independent peer-reviewed translations were prepared by professional translators, and a third advance translation was made by a competent member of staff; 2) Review: a preliminary translation was developed from the first drafts; 3) Adjudication I: an international expert was consulted, and decisions were made on issues which had previously been identified as controversial; 4) Pretesting: an elaborate cognitive pretest with another expert was conducted to ensure the cognitive validity of the translation, the resulting improvements were applied to the translation, and a first small test group of n=2 participants filled in an online version of the test; 5) Adjudication II: the translation was reviewed and discussed once more, changes were reconsidered, and the adapted version was finally accepted as appropriate for the upcoming exploratory international survey.

The German and US surveys: samples and method
For the international survey the following content areas were included: media didactics / teaching with media, media education, technological knowledge, beliefs and self-efficacy, and demographic data. School reform was excluded for reasons of efficiency and manageability and to avoid potential difficulties with the cultural fit of this field, which depends significantly on systemic aspects.
The study was designed as an "ex-post-facto" study since it was not possible to manipulate variables or randomize participants or treatments. Therefore, a descriptive, comparative and non-experimental, quantitative questionnaire-based approach was applied.
The US sample consisted of n=109 test persons who were aged 22 on average (SD=2.16). 11.21% were male. All of them were preservice teachers or students of related studies from one college and five public US universities. As for the procedure, the questionnaire was distributed both as a paper version and as an online survey between April and May 2015.
For the comparison, the data from the third major survey were included. This sample consisted of n=914 test persons aged 23 on average (SD=4.24). 35.52% were male. All test persons were preservice teachers from six different universities. The survey was conducted as a paper version in summer term 2014.
The international survey was one aspect of a greater project, so it was designed as an exploratory study. It served to open up a new comparative view but was not intended to reach the same range as the German main study, which is why the German and US test groups differed in size.

Results
For the descriptive comparative analysis, the means for all items were calculated separately for both samples and compared using simple t-tests. These means were then summarized as one mean value for each field and sample. The confidence level was set at 95%. In the following, the results will be introduced descriptively. An interpretation will be provided in chapter 8.2.4. As Table 6 illustrates, the German means for all three fields (media didactics, media education and technological knowledge) are significantly higher than the US means. The largest difference can be found in the field of media education.
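The comparison of two group means of very different sizes can be sketched as follows. The text only speaks of simple t-tests; the variant shown here is Welch's unequal-variance t-test, which is an assumption chosen because the two samples differ strongly in size. The function name `welch_t` is hypothetical, and the means and standard deviations in the example are invented; only the group sizes are taken from the text.

```python
from math import sqrt


def welch_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Return (t, df) of Welch's unequal-variance t-test from summary statistics."""
    # Squared standard error of each group mean.
    se2_a = sd_a ** 2 / n_a
    se2_b = sd_b ** 2 / n_b
    t = (mean_a - mean_b) / sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df


# Group sizes are taken from the text (n=914 and n=109); the means and
# standard deviations are invented purely for illustration.
t_value, df = welch_t(0.55, 0.18, 914, 0.48, 0.20, 109)
# For large df, |t| > 1.96 roughly corresponds to a two-sided test at the 95% level.
significant = abs(t_value) > 1.96
```

Because the smaller group dominates the standard error, the effective degrees of freedom stay close to the size of the smaller sample, which illustrates why the disproportionate group sizes limit the precision of the comparison.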

% of students with correct answers
In the field of media didactics, German students achieved higher results on items related to the following topics: films at school, the constructivist use of media in lessons, media didactic concepts, practice programs, computer simulations, computer learning programs, learning through films, behaviorism, and methods of empirical/quantitative research. Three items run counter to this tendency, as US students achieved higher scores here. The first one requires skills in identifying and processing media influence (Tulodziecki 1997), the second one knowledge about using computer games for learning and the third one knowledge about the use of online forums for homework.
With regard to media education, German students were more successful on a majority of the topics covered by the questionnaire. These topics are role models in the media, conservative pedagogical attitudes, age-specific media activities, consumption of violent media content, media use for the satisfaction of needs, developing media competencies and conditions of media production. One item contradicts the tendency described: here, the share of correct answers among US students was 29.5 percentage points higher than among their German counterparts, which is a remarkably high difference. This item describes a scenario which requires expertise in the area of understanding and assessing conditions of media production and media dissemination (Tulodziecki 1997).
Also in the field of technological knowledge, German students answered a majority of questions with higher success. These items were about general functions of social networks, types of data, Google functions, internet browsers, hot spots, meta search engines, computer hardware and software. Five items run counter to this tendency, as the US test group achieved higher results here. The two that show the largest differences between the test groups (20.7% and 65.4%) are concerned with knowing and using different social media.
With regard to beliefs, the results show that the German means are significantly higher than the US means both in the field of media didactics and in the field of media education. This indicates that the attitudes German students expressed concerning the use of media for these purposes were more positive; for example, they reported being more convinced of the usefulness of a media integration which allows students to approach lesson content independently, and they agreed less with the statement that students are already aware of manipulations inherent in media, which therefore need not be further addressed in the classroom.
The difference in self-efficacy is not significant, meaning that the German and the US study participants showed comparable confidence in their ability to teach with and about media successfully; for example, both groups rated their abilities to evaluate the quality of digital learning programs approximately equally.

Discussion and Conclusion
For the interpretation of these data, it has to be considered that the reliabilities of the test instrument still require further improvement. Moreover, the numbers of participants in the two groups compared are rather disproportionate. The results must not be understood as sound proof of pedagogical media competencies but rather as tendencies that pave the way for further research.

Media didactics / teaching with media
All in all, the data show that the sample of German students had higher competencies in the field of media didactics / teaching with media than the students in the US sample. A possible explanation could be more relevant learning opportunities during their studies, but the students' self-reports do not support this thesis: comparable shares of German and US students claimed to have learned about teaching with media during the course of their studies (78.8% of German students vs. 77.8% of US students). Assuming that no confounding factors like different perceptions of the item text came into effect, another interpretation is that the quality and topical focus of the studies both test groups experienced were heterogeneous and led to different shapes of competencies. Consequently, asking for more details about the learning opportunities in future studies would be helpful for the interpretation of the differences in results.
With regard to an analysis on the level of items, some items run counter to this trend of higher media didactical competencies on the part of the German participants. For example, two of these items required competencies in using computer games for learning and in the use of online forums for homework. The US sample achieved better scores on these items, possibly because these participants had more opportunities to gather experiences with computer games in class and forums for homework during their own schooldays. Empirical data on students' computer use support this assumption: in 2009, when a majority of the study participants were still at school, 88% of all US students were reported to use computers during instructional time in the classroom rarely, sometimes or often (Gray, Thomas, and Lewis 2010), while the percentage of German students who used the computer at school was as low as 64.6% (OECD 2015).

Media education
64.2% of all German participants indicated having had learning opportunities in the field of media education while the share of US students was 78.9%. Yet, German students had significantly more success in answering a majority of the media educational topics covered by the questionnaire. This observation substantiates the assumption made based on the findings in media didactics that the study content both test groups faced differs.
Noticeably, the two items with the largest difference in the answering pattern (with the results of German participants being 28.2 and 33 percentage points higher) contain the term media competencies. Despite the complex adaptation process, terminology problems have to be regarded as a possible explanation for these discrepancies: there are several ways to translate the German term "Medienkompetenz", and their precise definitions differ according to context. One team of translators decided on a direct translation and chose media competencies, which was accepted for the final version. Other terms are also frequently used, for example media literacy (as suggested by the second team of translators), digital competence, digital literacy, or computer literacy (Røkenes and Krumsvik 2014). As the remarkably high discrepancies suggest, terminological differences in key terms in the field of pedagogical media competencies are a great challenge for the development of instruments that could work internationally.

Technological knowledge
Also in the field of technological knowledge, the German students answered a majority of questions with higher success. It has to be considered that technological knowledge depends on everyday knowledge to a higher degree than the fields of teaching with media and media education, given the omnipresence of media in everyday life. Acquiring media literacy and technological knowledge may be part of teacher training, but it also takes place in informal learning processes. Hence, it seems likely that German students interact with media in other ways than US students do. This thesis of varying media use is substantiated by empirical data, for example with respect to social media: in the US, 76% of young people aged 13 to 17 reported using social media in 2014/15 (Lenhart 2015), while in Germany only 68.5% of young people aged 14 to 17 reported using social media in the same period of time, and 57% if the age group from 12 to 17 is considered (MPFS 2014). Consequently, a great challenge when evaluating the success of teacher education programs with regard to the development of pedagogical media competencies and its dependent variables is to measure the informal learning processes. For this study it can be concluded that the integration of further items on informal media use would be helpful for the interpretation of results.

Beliefs and self-efficacy
According to Redman (2012), the perceptions of the affordances of new technologies are also shaped by students' experiences with these technologies: it was found that, once the students in that study became acquainted with certain media, their perceptions shifted towards a more positive assessment. However, the German students in our study did not describe more learning opportunities than the US study participants but still showed higher means in the corresponding beliefs. Hence, the correlation of experience and beliefs as argued by Redman (2012) could not be confirmed here.
Differences in the perceived self-efficacy of both groups are not significant. This observation is noteworthy since there is evidence that TPACK knowledge may be predictive of self-efficacy beliefs about technology integration (Abbitt 2011). Due to overlaps of TPACK and the M³K model, comparable results could be expected here, meaning that according to Abbitt's (2011) results, German students should show higher self-efficacy beliefs because of their higher pedagogical media competencies which were measured in the study. Hence, further research will be necessary here with regard to potential confounding factors and other influences that may have led to this contrary outcome.

Conclusion
One important goal of this study was the adaptation of a nationally developed instrument for further use in other national contexts taking Germany and the USA as examples. Results show that the international comparative approach adds a number of challenges: while an elaborate adoption process sought to ensure comparability of the German and the US version, the basis was still developed by German scholars and influenced by a German background in terms of fundamental terminology and literature. The possibility that this background has an impact on the results cannot be ruled out and is a great challenge for cross-national studies in the field of media pedagogy.
With respect to these limitations, the overall results of the study suggest that the selected sample of German preservice teachers has slightly higher pedagogical media competencies than the sample of US students. According to their self-reports, German students did not have significantly more learning opportunities; as the differences in the competencies measured are still significant, the learning opportunities both groups had must have differed to some degree and led to more or different competencies. Presumably, the topics within the field of media pedagogy that are covered in both countries vary. It has been previously established that, considering media pedagogy as an interplay of the three fields teaching with media, teaching about media (media education) and school reform, a majority of US study programs with explicit reference to media pedagogy focus on teaching with media and neglect the other two areas, while the respective German study programs show the same tendency but put more emphasis on media education and school reform (Tiede, Grafe, and Hobbs 2015). A transfer of these conclusions to the results of the study described in this paper leads to the assumption that the media pedagogical contents within teacher education of both countries could also differ and include a larger variety of topics within Germany. Therefore, further research on a core curriculum of media pedagogical topics in teacher education would greatly assist further cross-national research in this field. Further research will be necessary to consolidate these assumptions and exploratory findings. Although a cross-national comparison inevitably holds a number of challenges (e.g., culture, history, focus, language, and background), it also has distinctive affordances, allowing for valuable insights by increasing the variety of viewpoints and providing a broadened, globally interconnected perspective.
It opens up a variety of options for subsequent studies; elaborating on the differences between media pedagogy in German and US teacher training on the basis of the findings introduced here will bring about valuable insight into potential improvements of both systems. With regard to the varying focus of media pedagogy within teacher education, curriculum analyses and a comparative evaluation will help draw conclusions on the status quo. Based on the results introduced here, it can be assumed that there are in fact differences in the pedagogical media competencies of German and US preservice teachers, resulting from differences in the role, shape and focus of media pedagogy in the respective teacher education programs. However, taking into account that media pedagogy is not a mandatory part of teacher education in either country, both the USA and Germany are facing similar challenges and potentials for systemic improvement.

Main Conclusions from and Further Perspectives on Paper 1
A main outcome of the paper "Media Pedagogy in German and U.S. Teacher Education" is the conclusion that the German sample of preservice teachers has slightly higher "pedagogical media competencies" than the US sample. This conclusion builds on the observation that the overall results of the German sample in the survey presented were better than the results of the US sample. To enhance understanding, to contextualize and evaluate this conclusion and to draw valid conclusions for future uses, it is helpful to look at relevant results in greater detail. Hence, selected noteworthy items were analyzed again critically with regard to influential aspects such as translation and context. The following chapter thus supplements the results presented in the article and provides a broader perspective. For this purpose, another content matter expert was involved in a critical discussion, the findings of which will be presented in the following. The charts included for illustration purposes are based on data that were collected and analyzed jointly in the M³K project together with project partners.

Mediendidaktik/teaching with media
Out of sixteen items in the field of Mediendidaktik, German participants achieved better results in ten items. Three items were solved comparably well by German and US participants (difference < 5%), and US participants performed better in three items. Figure 5 compares the shares of correct answers per cohort. As the chart shows, a few items stand out with regard to the number of correct answers and thus deserve further consideration.

MD8
In a lesson on "Political Decisions and Their Effects", a politics teacher uses learning software which simulates how the initial situation of a fictitious state changes when the students assume the role of a government commission and invest points in selected areas, e.g. productivity or quality of life, which in turn influence conditions in other areas, e.g. politics or environmental pollution.
What are the main learning requirements that the learners must meet in order to realize the lesson successfully?

Knowledge of computer science ☐1
Argumentation abilities in political contexts ☐2
Knowledge of various forms of government ☐3
The ability to do networked thinking ☑4

Tab. 8.: Item MD8.
The share of German participants giving the correct answer here is about 36 percentage points higher than the share of US participants (GER: 63.2 %; USA: 26.7 %). This is the largest difference between the results of the German and US test groups throughout the whole survey. For German participants, this was one of the easier items, while it was evidently rather difficult for US participants. The feedback conversation which was conducted with a US expert after the survey administration revealed a possible explanation for this heterogeneity. The expert stated that the terminology in the correct answer option, "the ability to do networked thinking," was neither precise nor easy to understand. She herself had problems comprehending and evaluating it, a problem which was not reported for the German equivalent "die Fähigkeit zu vernetztem Denken." Since it can be assumed that the expert, being a native speaker, has advanced language proficiency and reading comprehension skills, it is possible that the US preservice teachers in the study also had problems understanding this item. However, it is remarkable that this difficulty was not identified during the elaborate translation process (cf. Chapter 8.1.2). Instead, the wording was a verbatim suggestion by one of the professional translation teams which did not evoke comments or a need for further adaptations in the ongoing translation process. The second translation suggested was "ability to think laterally," which is close to the idiom "thinking outside the box" and thus not fully congruent with vernetztes Denken, which led to the rejection of this alternative. This background leads to the assumption that the translation is correct but that the underlying concept is less familiar in the USA. In the German discussion, this concept was shaped, e.g., by Vester (1988; 1996;. According to Ossimitz (2000), it is one of four constituents which make up systemic thinking, with the other three constituents being dynamic thinking, thinking in models, and system-appropriate acting (cf. Maierhofer 2001; Rieß and Mischo 2008). A corresponding in-depth exploration of this concept is not an equally established part of the US discourse, which might have led to the comparatively low share of correct answers in the US sample.
Another peculiarity can be found with item MD16:

MD16
A teacher has used an educational software in a lesson unit. Before and after the lesson unit, she gathered empirical data about the student's degree of educational success by testing their knowledge, which she then compared with a control group. She would like to use the results for future teaching situations.

Which of the following statements is most accurate?
It is generally not possible to draw consequences for future activities from the data collected. ☐1
If the data confirms a positive learning outcome, the teacher can conclude that the concept tested will also be successful in all other classes. ☐2
From these results, the teacher can draw conclusions concerning the aspects of the teaching process that increased the students' learning success. ☐3
The data allow the teacher to evaluate whether the tested concept has led to learning progress for the students. ☑4

Tab. 9.: Item MD16.
The share of German participants giving the correct answer here is approximately 27 percentage points higher than the share of US participants (GER: 68.3 %; USA: 41.6 %). The item requires declarative knowledge about methods of empirical research. Processes of translation and validation did not indicate problems with the translation here.
Hence, it seems a likely interpretation that German students had more chances to learn about methods of empirical research in their studies, compared to their US peers in the study. This assumption is substantiated by item 15, which is also about methods of empirical research and was solved correctly more often by German participants (difference: approx. 9 percentage points; GER: 56.2 %; USA: 47.6 %). The research design of the study presented does not allow for conclusions regarding respective teacher education curricula. Hence, to verify this thesis it will be helpful to add comparative curriculum studies in order to gain insight into the actual contents of teacher education with relation to empirical research.
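The item-level comparisons in this section rest on simple differences between shares of correct answers. The following minimal sketch recomputes these gaps from the percentages reported above and applies the 5-point threshold used in this chapter to call two results comparable. The function name `compare` and the label "MD15" for item 15 are shorthand introduced here for illustration only.

```python
# Shares of correct answers in percent (GER, USA) as reported in the text.
SHARES = {
    "MD8": (63.2, 26.7),
    "MD15": (56.2, 47.6),  # "item 15" in the text; the label is shorthand
    "MD16": (68.3, 41.6),
}


def compare(ger, usa, threshold=5.0):
    """Return the GER-minus-USA gap in percentage points and a verbal label."""
    gap = round(ger - usa, 1)
    # Gaps below the threshold are treated as comparable results.
    if abs(gap) < threshold:
        return gap, "comparable"
    return gap, ("GER higher" if gap > 0 else "USA higher")


gaps = {item: compare(ger, usa) for item, (ger, usa) in SHARES.items()}
```

Expressing the gaps in percentage points rather than in percent avoids ambiguity: a share of 63.2 % is 36.5 percentage points above a share of 26.7 %, but more than twice as high in relative terms.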

Medienerziehung/teaching about media
Out of fourteen items in the field of media education, German participants achieved better results in eight items. Five items were solved correctly by comparable shares of German and US participants (difference < 5 %), and US participants performed better in one item. In this field, too, some items need to be reconsidered. Figure 6 displays the shares of correct answers per cohort in this field. Out of the eight items that were solved correctly more often by German participants, the largest deviations between German and US results were found in items 5 (approx. 28 percentage points difference: GER: 64.4 %; USA: 36.2 %) and 6 (approx. 33 percentage points difference: GER: 63.1 %; USA: 30.1 %):

ME5
Media education offers different approaches and basic attitudes to dealing with media. One of these approaches stipulates that children and youth should have largely unrestricted media access so that they can develop into competent media users either by themselves or with teacher assistance.
Which of the following statements most closely reflects a comprehensive understanding of media competencies?

It is noteworthy that both items contain the term media competencies. Against the background of the findings from the previous chapters, one interpretation is that terminology is problematic in these cases. As discussed above, media competencies is a term less common in the US context compared to Medienkompetenz in Germany, and concepts and understanding can be expected to differ between the two countries and languages. This conceptual ambiguity illustrates the challenges of international research and of cultural adaptations that go beyond the semantic level, and it limits the informative or comparative value of these two items. It leads to the question of how to meet these challenges in order to achieve a comparable result. Hypothetical approaches in relation to the two items mentioned above might include providing a definition of "media competencies" or analyzing options for replacing "media competencies" with the concept of "media literacy." Again, the concept of "media literacy" is not fully congruent with Medienkompetenz, but a respective change would do justice to the methodological claim of prioritizing functional equivalence over literal translations (Peschar 1982; Harkness 2008).
Another striking item is ME8, which shows a deviation of about 30 percentage points in favor of the US survey participants (GER: 15.3 %; USA: 44.8 %):

ME8
Students are addressing the topic of "news" in a school class. For this purpose, they form small groups that represent public and private television broadcasting companies. They are presented with specific background conditions about their broadcasting companies, and assume the role of broadcast editors. In this role, the students will decide which news report, among a variety of news reports, they will present as the top story for a particular day. They present their decisions, concepts and justifications to the class and compare them with news that has actually been broadcasted.

Which media-educational goals are primarily addressed in this example?
Students should learn…
… to distinguish between serious and less serious design concepts for news reporting. ☐1
… to assess the economic, personal and organizational conditions of the production and distribution of news. ☑2
… to assess the subjectivity in the selection and distribution of news by journalists. ☐3
… to distinguish between the frequency of news about an event and its actual societal relevance. ☐4

This item stands out because it is the only item from the field of media education showing this tendency. For German participants, it was the most difficult item, while it was of medium difficulty for the US test group, compared to the other items in this field. Possibly, the scenario described in the item is more familiar to US preservice teachers, or the competency area of "understanding and assessing conditions of media production and dissemination," which is addressed by this item, has been a topic of higher relevance in the past for the US participants in the study. It is noteworthy in this context that this thematic area is a central concern for the research field of media literacy, which, according to Culver and Redmond (2019), is achieving growing public awareness in the US. The authors describe increasing efforts within the US to integrate respective contents into initial teacher programs even though the status is still perceived as unsatisfactory. Against this background, the higher success of US preservice teachers with this item can be read to indicate a successful education of preservice teachers with regard to this field of media literacy, which, as will be argued in Part III, might not have a direct equivalent in the German tradition of media education in teacher education. Further comparative studies would be helpful to substantiate this thesis.
Overall, these additional remarks on a number of noteworthy items point out challenges in the research methodology, for example with regard to the equivalence of translations or equal conditions for understanding concepts. They also offer ideas for a more suitable evaluation of the study results on the microlevel of items in some cases. However, there are also major influences to be considered for an overall evaluation and conclusion. First of all, the informative value of the survey is restricted by the non-finalized validation of the measurement instrument. The challenges in validating the instrument are connected to Endberg's (2018) criticism of lacking empirical evidence in the German tradition of media pedagogical research. While the reasons for these problems are difficult to pin down, a first and obvious observation is that the complexity of media-related educational competencies as a construct poses serious challenges for objective measurement. Furthermore, missing learning opportunities for media-related educational contexts within the German system of teacher education are a central issue in this regard (Herzig et al. 2015; Herzig and Martin 2018; cf. Chapter 7.3), and the analysis of results from the US sample implies that this problem is by no means limited to the German context. To clarify the conditions and circumstances of these learning opportunities, the following third part of this dissertation will analyze respective practices in teacher education in Germany and the USA. Yet it remains a research desideratum to analyze in greater depth how far the German empirical research tradition from the perspective of media pedagogy can benefit from professionalization research and whether the critique brought up by Endberg (2018) does justice to the empirical research approaches provided by German media pedagogical research. After all, this critique appears questionable especially against the background of other respective studies, e.g., from the field of Medienkompetenz, in which specific aspects of Medienkompetenz have been operationalized successfully. Examples of such aspects include Mediale Zeichenkompetenz [i.e., the competencies required to understand symbolic representations in media such as pictures or auditive signals] (Möckel 2013; Nieding and Ohler 2008) or information and computer literacy (Bos et al. 2014; cf. Chapter 2).
As a result, to evaluate the results from the study, it is necessary to consider the study participants and to understand the role of learning opportunities within their study career. The conclusion proposed in the article, suggesting that the "pedagogical media competencies" of German students are slightly higher in comparison to those found in the US sample, was drawn against the background of assumed differing learning opportunities of students from both countries, which may have caused different occurrences of respective competencies. The exploratory study design adds to the challenges connected to the cohort, e.g., with regard to the clearly disproportionate sizes of the two national samples, or the non-finalized instrument as pointed out above. From these conclusions, the research desideratum follows that study conditions and circumstances should be optimized in future studies to enhance the comparability and informative value of the data collected.
The additional critical reflection and analysis of selected items reveal a new perspective on the overall context of measuring media-related educational competencies. A central outcome of the theoretical analysis of competency modeling in Part I was the conclusion that national models of media-related educational competencies are strongly tied to their national backgrounds. M³K in particular defines the competencies that German preservice teachers are expected to acquire in the course of their teacher education program. As described, this excludes certain facets and highlights others, corresponding to the characteristics of German initial teacher education. Obviously, applying such a model with its national implications to another national context can be critical because the circumstances of the respective comparative teacher education program must be expected to lead to other emphases, expectations and occurrences of media-related competencies. In Paper 1, this aspect is addressed: "While an elaborate adoption process sought to ensure comparability of the German and the US version, the basis was still developed by German scholars and influenced by a German background in terms of fundamental terminology and literature. The possibility that this background has an impact on the results cannot be ruled out and is a great challenge for cross-national studies in the field of media pedagogy." (Tiede and Grafe 2016, 26) However, the conclusion drawn in the paper in the light of this limitation still assumes higher pedagogical media competencies on the side of the German sample. Now, taking into account the additional critical analysis presented above, this conclusion should be rephrased in favor of an important emphasis: the results of the study suggest that the selected sample of German preservice teachers has slightly higher Medienpädagogische Kompetenzen/media-related educational competencies in the sense of the German M³K model than the sample of US students. 
This addition points to the central role that the underlying model and measurement instrument play in the conclusion.
Based on these considerations, the applicability of the German measurement instrument to a US context appears questionable. However, it has been pointed out that there are other national models which are successfully applied and operationalized by respective measurement instruments in a number of different national settings. TPACK, for example, which has been described as rather basic in terms of level of detail and depth, has been successfully used as a reference, applied and measured all over the world (Crompton 2015; Martin 2015; Schmidt et al. 2009; Tondeur et al. 2017; Sang et al. 2016). Hence, it will be beneficial for international comparative studies to select models and measurement instruments with care and to consider their potential for transnational applicability as a selection criterion. Furthermore, international competency models such as DigCompEdu should be considered in this context. As described in Part I, the development process of such models takes into account models, guidelines and frameworks from multiple countries. They are therefore geared towards consensus and exclude peculiarities of single countries. As a result, models like DigCompEdu are explicitly intended for international application and might be more appropriate in such a context. These considerations go beyond the scope of Paper 1, which was written in the course of the M³K project. Its intention was to expand the German main pilots, to add a further perspective to them, and to learn more about the applicability of the M³K measurement instrument. To this extent, the results can also be read to suggest that the instrument depicts media-related competencies acquired in German teacher education more appropriately than those acquired in US initial teacher education programs. With regard to the measurement of competencies as suggested by the M³K competency model, a triangulation of methods, e.g., by combination with further qualitative measures, would be desirable.
Furthermore, it would be worthwhile to supplement these conclusions with further studies using measurements based on TPACK, DigCompEdu and other suitable instruments, in order to confirm whether there are actual differences between the media-related competencies of German and US preservice teachers.