
Table 4 Indices of the quality of SIT-PE items evaluated based on the 1PL IRT analysis

From: Towards a science inquiry test in primary education: development of items and scales

| Item (a) | Type (b) | Measure (c) | Model SE (d) | Mean infit (e) | Mean outfit (e) | Corr. (f) | Exp. Corr. (g) | Estim. Discrim. (h) |
|---|---|---|---|---|---|---|---|---|
| An1_k1 | 5 | −0.70 | 0.04 | 1.07 | 1.08 | .28 | .40 | 0.82 |
| An1_k6 | 4 | −0.97 | 0.05 | 0.96 | 0.94 | .37 | .31 | 1.17 |
| An1_s4 | 5 | −1.20 | 0.04 | 1.47 | 1.42 | .51 | .40 | 0.43 |
| An2_s1 | 5 | 0.09 | 0.04 | 0.94 | 0.94 | .31 | .39 | 1.06 |
| An2_v3 | 1 | 0.31 | 0.05 | 1.07 | 1.13 | .18 | .29 | 0.77 |
| An3_k5 | 2 | 0.62 | 0.03 | 1.36 | 1.38 | .49 | .47 | 1.13 |
| An3_k7 | 5 | 0.67 | 0.04 | 0.99 | 1.04 | .25 | .36 | 0.94 |
| An3_s5 | 2 | 1.29 | 0.04 | 1.34 | 0.96 | .48 | .38 | 1.17 |
| Pl1_k2 | 4 | −0.37 | 0.05 | 1.00 | 0.99 | .32 | .31 | 1.03 |
| Pl2_k3 | 2 | −0.23 | 0.03 | 0.75 | 0.81 | .47 | .52 | 0.74 |
| Pl2_k4 | 5 | −0.08 | 0.04 | 0.93 | 0.94 | .33 | .39 | 1.07 |
| Pl2_s2 | 2 | 0.01 | 0.03 | 0.56 | 0.56 | .60 | .51 | 1.23 |
| Pl3_p2 | 3 | 0.50 | 0.04 | 0.96 | 0.96 | .42 | .38 | 1.06 |
| In1_r3 | 5 | −0.74 | 0.05 | 1.05 | 1.04 | .52 | .40 | 0.98 |
| In2_m1 | 5 | 0.04 | 0.03 | 0.96 | 0.95 | .50 | .39 | 1.12 |
| In2_s3 | 5 | 0.29 | 0.05 | 0.71 | 0.73 | .40 | .38 | 1.40 |
| In3_m2 | 2 | 0.87 | 0.03 | 0.58 | 0.65 | .51 | .44 | 0.82 |
| In3_r1 | 1 | 0.39 | 0.04 | 0.94 | 0.93 | .36 | .27 | 1.18 |
| In3_r2 | 2 | −0.06 | 0.04 | 1.12 | 1.15 | .44 | .52 | 1.36 |
| Kn1_t1 | 4 | −0.89 | 0.04 | 0.89 | 0.87 | .45 | .31 | 1.45 |
| Kn1_v2 | 2 | 0.66 | 0.04 | 0.91 | 0.91 | .47 | .47 | 0.68 |
| Kn2_t2 | 4 | −0.99 | 0.03 | 1.02 | 1.03 | .28 | .31 | 0.91 |
| Kn2_v1 | 2 | 0.39 | 0.05 | 1.71 | 1.81 | .38 | .49 | 0.75 |
| Kn3_p1 | 5 | 0.11 | 0.03 | 0.90 | 0.91 | .34 | .39 | 1.12 |

  1. Values outside the accepted ranges are marked in bold
  2. (a) Item names are codes: the first two letters show the dimension assessed (An, analytical skills; Pl, planning skills; In, interpretation skills; Kn, knowledge), the number next to them shows the level of the skill assessed (1, basic; 2, medium; 3, high), the letter after the underscore indicates the task, and the number at the end is the number of the item in the task
  3. (b) Item type: 1, multiple-choice question with only one correct option (radio button); 2, open-ended question coded with 2 or more points (partial credit item); 3, forming a sequence of phases (multiple choice); 4, open-ended question coded dichotomously (correct or incorrect); 5, multiple-choice question with more than one correct option (check boxes; the number of correct options is specified, e.g., 2 out of 5; selection from a number of pictures)
  4. (c) Item difficulty measure
  5. (d) Standard error of the item difficulty measure
  6. (e) Mean infit and mean outfit refer to the Infit MNSQ and Outfit MNSQ indices, which indicate how well the scores predicted from item difficulty and student ability fit the observed student scores (items with infit and outfit indices in the range .7–1.3 are considered good items)
  7. (f) Correlation between item score and respondent ability score estimated with the 1PL IRT model (items with a correlation higher than .20 are considered satisfactory; items with a correlation higher than .30 are considered good)
  8. (g) Expected correlation between each item score and respondent ability score
  9. (h) Estimated item discrimination, i.e., the item discrimination index that would be obtained if the data were analyzed with a 2PL IRT model (items with estimated discrimination in the range .5–2.0 are considered to have satisfactory discrimination)
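
The item-code scheme and the acceptance ranges given in the notes above can be applied mechanically. The following is a minimal Python sketch, not the authors' procedure: the names `parse_item_code` and `flag_item` are hypothetical, the code pattern assumes a single lowercase task letter as in every item of the table, and treating the range endpoints (.7, 1.3, .5, 2.0) as acceptable is an assumption, since the notes do not say whether the ranges are inclusive. The sample rows are copied from the table.

```python
import re
from typing import List, NamedTuple

# Dimension and level labels as given in the table notes.
DIMENSIONS = {
    "An": "analytical skills",
    "Pl": "planning skills",
    "In": "interpretation skills",
    "Kn": "knowledge",
}
LEVELS = {1: "basic", 2: "medium", 3: "high"}


class ItemCode(NamedTuple):
    dimension: str
    level: str
    task: str
    item_number: int


def parse_item_code(code: str) -> ItemCode:
    """Split a code such as 'An1_k1' into its documented parts (hypothetical helper)."""
    # Assumption: one lowercase task letter followed by the item number.
    match = re.fullmatch(r"(An|Pl|In|Kn)([123])_([a-z])(\d+)", code)
    if match is None:
        raise ValueError(f"unrecognised item code: {code!r}")
    dim, level, task, number = match.groups()
    return ItemCode(DIMENSIONS[dim], LEVELS[int(level)], task, int(number))


def flag_item(infit: float, outfit: float, corr: float, discrim: float) -> List[str]:
    """Return the names of the indices that fall outside the ranges in the notes."""
    flags = []
    if not 0.7 <= infit <= 1.3:      # infit MNSQ range .7-1.3 (endpoints assumed inclusive)
        flags.append("infit")
    if not 0.7 <= outfit <= 1.3:     # outfit MNSQ range .7-1.3 (endpoints assumed inclusive)
        flags.append("outfit")
    if corr <= 0.20:                 # "higher than .20" is satisfactory
        flags.append("corr")
    if not 0.5 <= discrim <= 2.0:    # estimated discrimination range .5-2.0
        flags.append("discrim")
    return flags


# A few rows from the table: (item, mean infit, mean outfit, corr., estim. discrim.)
SAMPLE_ROWS = [
    ("An1_k1", 1.07, 1.08, 0.28, 0.82),
    ("An1_s4", 1.47, 1.42, 0.51, 0.43),
    ("An2_v3", 1.07, 1.13, 0.18, 0.77),
    ("Pl2_s2", 0.56, 0.56, 0.60, 1.23),
]

if __name__ == "__main__":
    for name, infit, outfit, corr, discrim in SAMPLE_ROWS:
        parsed = parse_item_code(name)
        print(name, parsed.dimension, parsed.level, flag_item(infit, outfit, corr, discrim))
```

A check of this kind only reproduces the screening rules stated in the notes; whether a flagged item should be revised or dropped remains a substantive judgment.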