Table 2 Comparison of diagnostic performance for PTs and FAs among 6 radiologists, and between radiologists with and without PTs-HDM assistance

From: Hierarchical diagnosis of breast phyllodes tumors enabled by deep learning of ultrasound images: a retrospective multi-center study

 

| | AUC | Accuracy (%) | Sensitivity (%) | Specificity (%) | PPV (%) | NPV (%) | F1-score (%) |
|---|---|---|---|---|---|---|---|
| PTs-HDM | 0.883 (0.831, 0.927) | 87.3 (82.1, 91.9) | 92.3 (84.9, 98.4) | 84.3 (76.6, 90.4) | 77.9 (68.1, 86.7) | 94.8 (89.8, 98.9) | 84.5 (77.5, 90.3) |
| Senior 1 | 0.872 (0.821, 0.925) | 90.2 (86.1, 94.2) | 75.4 (65.1, 85.7) | 99.1 (97.1, 100.0) | 98.0 (93.3, 100.0) | 87.0 (81.0, 92.8) | 85.2 (78.1, 91.7) |
| Senior 1+ | 0.880 (0.826, 0.931) ↑ | 90.8 (86.7, 94.8) ↑ | 76.9 (65.7, 87.0) ↑ | 99.1 (96.8, 100.0) | 98.0 (93.2, 100.0) | 87.7 (81.7, 93.2) ↑ | 86.2 (78.4, 92.6) ↑ |
| Senior 2 | 0.712 (0.647, 0.780) | 75.1 (68.8, 81.5) | 55.4 (43.8, 67.5) | 87.0 (80.4, 93.2) | 72.0 (59.6, 84.1) | 76.4 (69.2, 83.7) | 62.6 (51.9, 72.3) |
| Senior 2+ | 0.817 (0.755, 0.880) ↑ | 84.4 (79.2, 89.6) ↑ | 70.8 (59.7, 82.4) ↑ | 92.6 (87.0, 97.2) ↑ | 85.2 (74.6, 94.2) ↑ | 84.0 (77.8, 90.3) ↑ | 77.3 (68.4, 85.7) ↑ |
| Senior Mean | 0.792 (0.734, 0.853) | 82.7 (77.5, 87.9) | 65.4 (54.5, 76.6) | 93.1 (88.8, 96.6) | 85.0 (76.5, 92.1) | 81.7 (75.1, 88.3) | 73.9 (65.0, 82.0) |
| Senior Mean+ | 0.848 (0.789, 0.906) ↑ | 87.6 (83.0, 92.2) ↑ | 73.9 (62.7, 84.7) ↑ | 95.8 (91.9, 98.6) ↑ | 91.6 (83.9, 97.1) ↑ | 85.9 (79.8, 91.8) ↑ | 81.8 (73.4, 89.2) ↑ |
| Attending 1 | 0.507 (0.441, 0.585) | 56.1 (49.1, 64.2) | 29.2 (18.9, 41.0) | 72.2 (63.5, 81.0) | 38.8 (25.0, 54.2) | 62.9 (54.6, 71.7) | 33.3 (22.0, 45.0) |
| Attending 1+ | 0.629 (0.556, 0.695) ↑ | 67.1 (60.1, 73.4) ↑ | 46.2 (33.3, 58.5) ↑ | 79.6 (71.7, 86.5) ↑ | 57.7 (44.4, 70.2) ↑ | 71.1 (62.7, 78.8) ↑ | 51.3 (39.2, 61.0) ↑ |
| Attending 2 | 0.775 (0.713, 0.839) | 79.2 (73.4, 85.0) | 70.8 (59.5, 81.7) | 84.3 (76.8, 90.8) | 73.0 (61.0, 84.1) | 82.7 (75.7, 89.3) | 71.9 (62.6, 80.0) |
| Attending 2+ | 0.846 (0.787, 0.904) ↑ | 86.1 (80.9, 91.3) ↑ | 78.5 (68.1, 87.9) ↑ | 90.7 (84.9, 96.1) ↑ | 83.6 (74.6, 92.6) ↑ | 87.5 (81.7, 93.2) ↑ | 81.0 (72.9, 88.1) ↑ |
| Attending Mean | 0.641 (0.577, 0.712) | 67.7 (61.3, 74.6) | 50.0 (39.2, 61.4) | 78.3 (70.2, 85.9) | 55.9 (43.0, 69.2) | 72.8 (65.2, 80.5) | 52.6 (42.3, 62.5) |
| Attending Mean+ | 0.738 (0.672, 0.800) ↑ | 76.6 (70.5, 82.4) ↑ | 62.4 (50.7, 73.2) ↑ | 85.2 (78.3, 91.3) ↑ | 70.7 (59.5, 81.4) ↑ | 79.3 (72.2, 86.0) ↑ | 66.2 (56.1, 74.6) ↑ |
| Resident 1 | 0.744 (0.675, 0.813) | 75.7 (69.4, 82.1) | 69.2 (58.3, 81.0) | 79.6 (71.7, 86.9) | 67.2 (54.8, 78.8) | 81.1 (73.5, 88.6) | 68.2 (58.6, 77.1) |
| Resident 1+ | 0.871 (0.817, 0.921) ↑ | 87.3 (82.1, 91.9) ↑ | 86.2 (78.0, 93.8) ↑ | 88.0 (81.5, 93.5) ↑ | 81.2 (71.1, 90.0) ↑ | 91.3 (86.0, 96.3) ↑ | 83.6 (76.3, 89.7) ↑ |
| Resident 2 | 0.538 (0.478, 0.614) | 52.6 (44.7, 60.7) | 58.5 (45.7, 70.5) | 49.1 (40.0, 59.4) | 40.9 (31.4, 50.5) | 66.3 (55.6, 76.9) | 48.1 (38.6, 57.3) |
| Resident 2+ | 0.781 (0.718, 0.840) ↑ | 76.9 (70.5, 82.7) ↑ | 83.1 (73.3, 92.1) ↑ | 73.1 (64.4, 81.1) ↑ | 65.1 (54.8, 74.7) ↑ | 87.8 (80.4, 94.1) ↑ | 73.0 (64.4, 81.5) ↑ |
| Resident Mean | 0.641 (0.577, 0.714) | 64.2 (57.1, 71.4) | 63.9 (52.0, 75.8) | 64.4 (55.9, 73.2) | 54.1 (43.1, 64.7) | 73.7 (64.6, 82.8) | 58.2 (48.6, 67.2) |
| Resident Mean+ | 0.826 (0.768, 0.881) ↑ | 82.1 (76.3, 87.3) ↑ | 84.7 (75.7, 93.0) ↑ | 80.6 (73.0, 87.3) ↑ | 73.2 (63.0, 82.4) ↑ | 89.6 (83.2, 95.2) ↑ | 78.3 (70.4, 85.6) ↑ |

  1. The data in brackets are the 95% confidence intervals. PPV, positive predictive value; NPV, negative predictive value; + indicates with PTs-HDM assistance. The upward arrow (↑) marks indicators that improved with PTs-HDM assistance.
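
As a reading aid, the threshold-based indicators in the table can be derived from a 2×2 confusion matrix. The Python sketch below is illustrative only and is not the authors' code. It assumes that PT is the positive class and FA the negative class. The counts (60 true positives, 5 false negatives, 91 true negatives, 17 false positives) are hypothetical: the table does not report them, and they were picked only because they happen to reproduce the point estimates in the PTs-HDM row. The names `Confusion` and `metrics` are likewise made up for this example. AUC and the confidence intervals are not covered, since they cannot be recovered from the four counts alone.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """2x2 confusion counts; PT is assumed to be the positive class."""
    tp: int  # PT called PT
    fn: int  # PT called FA
    tn: int  # FA called FA
    fp: int  # FA called PT


def metrics(c: Confusion) -> dict[str, float]:
    """Return the table's threshold-based indicators as fractions in [0, 1]."""
    total = c.tp + c.fn + c.tn + c.fp
    return {
        "accuracy": (c.tp + c.tn) / total,
        "sensitivity": c.tp / (c.tp + c.fn),
        "specificity": c.tn / (c.tn + c.fp),
        "ppv": c.tp / (c.tp + c.fp),
        "npv": c.tn / (c.tn + c.fn),
        # F1 is the harmonic mean of PPV (precision) and sensitivity (recall).
        "f1": 2 * c.tp / (2 * c.tp + c.fp + c.fn),
    }


# Hypothetical counts (an assumption, not taken from the table).
example = metrics(Confusion(tp=60, fn=5, tn=91, fp=17))
for name, value in example.items():
    print(f"{name}: {100 * value:.1f}")
# accuracy: 87.3, sensitivity: 92.3, specificity: 84.3,
# ppv: 77.9, npv: 94.8, f1: 84.5
```

With these assumed counts the six printed percentages match the PTs-HDM row, which is a convenient sanity check on how the columns relate to one another.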