ARTICLE

Developmental Testing

Kathleen E. Gilbride, PhD*

*Assistant Professor of Pediatrics, University of California at Los Angeles, Harbor-UCLA Medical Center, Torrance, CA.

FOCUS QUESTIONS

1. What role do developmental screening tests have in the monitoring of growth and development in pediatric practice?
2. What precautions must be taken in the interpretation of the results of developmental screening tests for individual children?
3. How do the reliability, validity, sensitivity, and specificity of developmental tests affect the interpretation of their results?
4. What developmental screening and full-scale developmental tests are available? What are their strengths and weaknesses?

Development during infancy and childhood is a dynamic process that generally unfolds in a predictable sequence. The age at which specific milestones are achieved may vary within a given range, with spurts and lags in development occurring commonly. Parents frequently turn to pediatricians when questions and concerns about their child's development arise. The pediatrician can help the parents determine whether their child is experiencing a temporary lag in a specific area (eg, late walking), a more serious developmental delay or disability (eg, retardation or cerebral palsy), or a significant behavioral problem (eg, hyperactivity, school failure). The pediatrician can use ongoing developmental surveillance to monitor a child's progress, with developmental screening and testing as supplementary tools to screen for and diagnose problems, initiate early intervention, and design appropriate treatment.

Developmental Surveillance

Developmental surveillance refers to an ongoing process of monitoring a child's developmental status at each pediatric visit and includes taking a thorough history, reviewing developmental milestones, making skilled observations of the child, and eliciting parental concerns.¹ This process provides an understanding over time of that child's developmental trajectory. It allows the pediatrician to observe the child's rate of development, temperamental style, and emotional adjustment to developmental phases (eg, separation, independence). In addition, the pediatrician learns about the parents' knowledge and attitudes about parenting and the nature of the parent-child interaction. This information is particularly important because it provides the backdrop against which future interpretations and decisions about the child's behavior and development will be made.

Developmental surveillance also helps the pediatrician to build a trusting relationship with the parent, which is critical in the event that delays or difficulties arise that require intervention. Parents who believe that the pediatrician knows their child well are more likely to work with the pediatrician when intervention is needed.

Developmental Screening

During the course of developmental surveillance, screening tests can be used to assess the child's developmental status periodically. These tests are meant to be brief, easily administered measures designed to assess the child's current global developmental functioning compared with a standardized sample of children of the same age. The purpose of such a tool is to identify children who have delays (a tool that does this well has high sensitivity) while accurately classifying those who do not have delays (a tool that does this well has high specificity). Although developmental screening tests are too time-consuming to be included at every health care visit or with every child followed, using them at key ages (eg, 6, 9, 12, and 18 months and 2, 3, 4, and 5 years) is a useful way of monitoring the achievement of developmental milestones, especially with children at increased risk of delay.
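The sensitivity and specificity described above can be made concrete with a small worked example. The following is a minimal sketch, not a published analysis: the counts are invented for illustration (200 screened children, 20 of whom have a delay confirmed by full assessment), and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ScreeningCounts:
    """Hypothetical screening results checked against a full diagnostic assessment."""
    true_positives: int   # children with delays whom the screen flagged
    false_negatives: int  # children with delays whom the screen missed
    true_negatives: int   # children without delays whom the screen passed
    false_positives: int  # children without delays whom the screen flagged


def sensitivity(c: ScreeningCounts) -> float:
    """Proportion of children with delays who are identified by the screen."""
    return c.true_positives / (c.true_positives + c.false_negatives)


def specificity(c: ScreeningCounts) -> float:
    """Proportion of children without delays who are correctly passed."""
    return c.true_negatives / (c.true_negatives + c.false_positives)


def positive_predictive_value(c: ScreeningCounts) -> float:
    """Proportion of flagged children who actually have a delay."""
    return c.true_positives / (c.true_positives + c.false_positives)


# Invented counts: 20 children with delays, 180 without.
counts = ScreeningCounts(
    true_positives=16, false_negatives=4,
    true_negatives=153, false_positives=27,
)

print(f"sensitivity: {sensitivity(counts):.2f}")
print(f"specificity: {specificity(counts):.2f}")
print(f"positive predictive value: {positive_predictive_value(counts):.2f}")
```

With these invented counts, sensitivity is 0.80 and specificity is 0.85, yet only about 37% of the children flagged as being at risk actually have a delay, because children without delays greatly outnumber those with delays. This is one reason a failed screen calls for further assessment rather than a diagnosis.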
Repeated measures are needed to form an accurate picture of the child's developmental progression because occasional lags are common in normal development. A single measure at a given time may be confounded by several factors (eg, fatigue, illness, and lack of cooperation), whereas multiple measures will yield more reliable results.

Developmental screening also provides parents with periodic feedback about their child's progress and opportunities to discuss developmental concerns. Parents may lack knowledge of child development; the developmental screening tool can be used to educate them about the normal progression of development. This can be especially helpful with parents who have inappropriate developmental expectations for their child or who are anxious about their child's mastery of developmental tasks. Screening tests can educate the pediatrician as well because subjective impressions often can be inaccurate. By using screening tools systematically, the physician increases his or her knowledge of normal child development and experience with developmental stages, which can, in turn, improve the identification of delays.

Periodic developmental screening that aids in the early detection of delays is particularly relevant since the passage of Public Laws 94-142 and 99-457. PL 94-142 requires states to provide an appropriate education to all school-age children regardless of handicapping condition; PL 99-457 requires the states to extend these services to children 3 to 5 years of age who are referred with developmental disabilities and to provide early intervention to infants and toddlers who are at increased risk of having developmental delays. Infants or children of preschool age who have disabilities most likely will be identified by pediatricians because they are not yet in the education system. Periodic developmental screening can help the pediatrician to identify delays early, allowing parents to gain access to intervention services as early as possible.

Pediatrics in Review Vol. 16 No. 9 September 1995. CHILD DEVELOPMENT Developmental Testing

DENVER DEVELOPMENTAL SCREENING TEST II

The most common screening measure used by pediatricians is the Denver Developmental Screening Test (DDST), which recently has been restandardized as the DDST II.² The restandardization was based on a sample of 2096 children in the Denver area who were stratified by age, race, socioeconomic status, and residential area. It is designed for ages 2 weeks to 6 years and consists of 125 items grouped into four areas: Personal-Social, Fine Motor-Adaptive, Language, and Gross Motor. It can be administered easily and scored in 15 to 20 minutes. Items are scored as either passed or failed, and profile scores are interpreted as being normal, questionable, abnormal, or untestable in accordance with the number of passes in each of the four areas. No separate scores are given for each area.

The DDST has been a successful screening instrument because of its easy-to-use format and practical use in both clinical and research settings. In addition to the DDST II, the Revised Denver Prescreening Developmental Questionnaire (R-PDQ) is available. It is a checklist of 97 items drawn from the DDST and can be completed by parents in about 10 minutes. Items are arranged in chronologic order based on the ages at which 90% of the standardization sample were able to perform an item. It has been suggested as a first step in the screening process to determine areas of concern, with follow-up with the DDST if either two delays are evident on initial screening or one delay persists on two separate occasions.

Screening tests, however, like all psychometric measures, have limitations of which the clinician should be aware. Most importantly, the reliability and validity of a screening instrument should be known. Because little information is available about the reliability and validity of the R-PDQ, it should be used very cautiously. The DDST and the DDST II have been researched more extensively than any other screening instrument in the field. Reliability studies have shown this tool to have high interrater reliability (ie, consistency across raters) and test-retest reliability (ie, consistency over short periods of time). In terms of validity, the DDST has been useful in identifying global delay and mental retardation, but children who have mild developmental delays or speech and language difficulties often were not identified. As a result, it has been criticized for its limited predictive validity and low sensitivity; that is, it identified some, but not all, children who had delays. The restandardized DDST II is improved in this respect, with greater sensitivity (ie, it classifies a higher percentage of children as being at risk and so misses fewer children who have delays), but it appears to have lower specificity (ie, it is less accurate in identifying children who do not have delays). As a result, it has a higher percentage of false-positives, that is, children who do not have delays but who are identified as being at risk. This also lowers its predictive validity in terms of ability to predict future outcome.

Other potential problems with the DDST II are common to many developmental screening instruments. For example, several items are based on parental report and not on direct observations. If a parent is an unreliable reporter, the outcome will have poor reliability. At the younger age levels, few items are related to later functioning; thus, it is used best as a measure of current developmental functioning rather than as a measure to predict later outcome. At the older age levels, a limited number of items are relevant to school functioning, yet clinicians may use it to predict school functioning or school readiness. It probably is used better as a measure of early development than as an assessment of school-age functioning. Finally, as with all screening instruments, it should not be used for diagnosis, but rather as a tool to identify risk, with referrals made for more extensive diagnostic assessment when indicated.

OTHER DEVELOPMENTAL SCREENING TOOLS

Other screening measures that appear to have high levels of sensitivity, specificity, reliability, and validity include the Minnesota Child Development Inventory (MCDI) and the Battelle Developmental Inventory Screening Test.³,⁴

The MCDI consists of 320 empirically derived questions about development and behavior that are grouped into seven scales: Gross Motor, Fine Motor, Expressive Language, Comprehension-Conceptual, Situation Comprehension, Self-Help, and Personal-Social. A General Development Scale is composed of items from the other seven scales. It is designed for children ages 1 to 6 years and can be completed by the parent in 30 minutes. Scores are categorized as either normal, delayed, or severely delayed, with age levels provided for each scale. The MCDI has high sensitivity and specificity; however, because it is based on parental report, it is subject to some of the same limitations in reliability and validity as the DDST. Its major drawback is the standardization sample on which the norms are based. The standardization sample comprised 796 middle-class Caucasian children from Bloomington, Minnesota. The lack of diversity in terms of race, geographic region, and socioeconomic status limits its generalizability to children who do not fit this demographic profile.

The Battelle is designed for children 6 months to 8 years old and takes 10 to 30 minutes to administer, depending on the child's age. It consists of 96 items taken from the 341 items in the full battery, which fall into the categories of Adaptive, Motor, Cognitive, Communication, and Personal-Social skills. Raw scores are converted into T-scores (mean = 50, standard deviation = 10) or Deviation Quotients (mean = 100, standard deviation = 15), and guidelines for referral are given based on subtest scores. The standardization sample of 800 children is based on 1981 US census data stratified by race, sex, and geographic region, with a middle-class emphasis. It has high test-retest and interrater reliability, although like other screening measures, some items depend on parental report, which will influence its reliability. Validity data indicate high concurrence with other standardized measures, and it differentiates clinical from nonclinical samples, with high sensitivity and specificity. Predictive validity is limited at the younger age levels, as with most other screening tools.

Several other screening measures are available to identify specific areas of functioning. For example, the Clinical Linguistic and Auditory Milestone Scale (CLAMS) and the Early Language Milestone Scale (ELMS) are designed for screening language development, the Home Screening Questionnaire (HSQ) can be used to identify high-risk factors in the home, and the Pediatric Symptom Checklist provides a way of screening for behavioral or emotional problems.⁵ These tools are useful when the pediatrician wants to screen for a particular area of developmental functioning.

SCHOOL READINESS SCREENING

In addition to the previously mentioned screening tests, the pediatrician should be familiar with school readiness screening tests.
School readiness tests differ from developmental screening tests by focusing on a narrow range of abilities that are specific to kindergarten or first grade. Typically, they include cognitive, language, and fine motor tasks as well as observations of the child's attention span and social skills (eg, ability to attend, cooperate, and follow directions). They are not designed to provide a diagnostic evaluation of the child's developmental functioning.

Parents may ask the pediatrician to assess whether their child has the specific skills needed to begin school or they may approach the pediatrician after the school has assessed their child's readiness. Understanding what readiness tests are designed to do and not to do is essential to helping the family make appropriate decisions regarding school placement. Because school readiness tests are intended to measure preparedness for academic achievement, they are used best for instructional planning, to identify areas that should be addressed in the kindergarten curriculum (such as learning letters, letter-sound combinations, numbers). Most children who have severe delays will have been identified before school entrance, so these tests screen for milder problems that affect school performance.

Some schools use these instruments inappropriately as admission or diagnostic tests to delay the start of school or to keep a child in kindergarten a second year. A low score on a readiness screening test may be interpreted by the school as indicating that the child is immature and schooling should be delayed 1 year, when, in fact, school enrollment may be exactly what the child needs, with placement in an educational setting that is appropriate for his or her learning difficulties. In such cases the pediatrician can advocate for a thorough evaluation because decisions about placement should not be based solely on the results of a single readiness test. This is especially true given that many readiness tests have significant limitations in reliability, validity, standardization sample, and the purpose for which they were designed.

A thorough assessment for school readiness should include a review of the medical history, a physical examination with a brief neurologic screening, and a vision and hearing screening. In addition, a screening test that is specific to the skills required in the school should be used rather than a global developmental screening tool. For example, the Early Screening Inventory (ESI) is designed specifically to identify children 4 to 6 years old who are at increased risk for school failure. It assesses abilities related to a child's potential for acquiring knowledge rather than just his or her current skill achievement. The ESI is administered individually in 15 to 20 minutes, with a total score based on items computed in three sections: Visual-Motor/Adaptive, Language and Cognition, and Gross Motor/Body Awareness. Recommendations are provided based on the total score and fall into one of three categories: 1) "OK" for scores that range within 1 standard deviation of the mean; 2) "Rescreen" for scores from 1 to 2 standard deviations below the mean, with rescreening recommended in 8 to 10 weeks; and 3) "Refer" for scores 2 or more standard deviations below the mean, with referral for diagnostic testing.

The ESI has been shown to have high test-retest and interrater reliability. It has satisfactory validity when compared with concurrent measures, and long-term validity (through third grade) has been shown to be high because it accurately identified the majority of children who later presented with learning problems. Thus, it has high sensitivity, with appropriate specificity as well. The ESI was standardized on 465 primarily Caucasian children from low to lower-middle class urban homes.
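The three ESI recommendation categories described above amount to cutoffs on a standardized (z) score. The following is a minimal sketch of that rule only. The function name, the norm mean and standard deviation, and the handling of scores that fall exactly on a cutoff are assumptions made for illustration; they are not taken from the ESI manual, which should be consulted for actual age-specific cutoffs.

```python
def esi_recommendation(total_score: float, norm_mean: float, norm_sd: float) -> str:
    """Map a total score to "OK", "Rescreen", or "Refer" using SD-based cutoffs.

    Assumption: a score exactly 1 SD below the mean is treated as "Rescreen"
    and a score exactly 2 SD below the mean as "Refer".
    """
    if norm_sd <= 0:
        raise ValueError("norm_sd must be positive")
    z = (total_score - norm_mean) / norm_sd  # distance from the mean in SD units
    if z <= -2:
        return "Refer"      # 2 or more SD below the mean: diagnostic testing
    if z <= -1:
        return "Rescreen"   # 1 to 2 SD below the mean: rescreen in 8 to 10 weeks
    return "OK"             # within 1 SD of the mean (or above it)


# Hypothetical norms for one age band: mean 20, standard deviation 4.
for score in (21, 15, 11):
    print(score, esi_recommendation(score, norm_mean=20, norm_sd=4))
```

Expressing the rule in standard deviation units rather than raw points makes it clear that the same raw score can lead to different recommendations at different ages, because the norm mean and standard deviation change with age.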
Because the ESI was standardized on such a narrow sample, it may be limited in its applicability to other samples that have different racial, socioeconomic, or geographic characteristics.

SUMMARY

Many screening tools are available to the pediatrician; the ones described here are just a sample of some that are used commonly. Understanding the purpose and limitations of each screening measure, in terms of reliability, validity, sensitivity, specificity, and the population for which it was designed, will help the pediatrician use these tools effectively to identify children at risk for developmental delay. Most importantly, the pediatrician should view screening tests as one source of information to be used in an overall strategy of developmental surveillance that includes careful historical review, systematic clinical observations, and solicitation of parental concerns and attitudes.

Developmental Testing

Once the pediatrician identifies a delay or suspects a developmental problem, the child should be referred for formal developmental assessment. Such a referral should be discussed with the parents by inquiring first whether they have similar concerns regarding their child's development. Many parents will have voiced these concerns already, or if they haven't, they may have been silently worried that something was wrong and readily agree to pursue developmental assessment. It can be explained to parents that a more extensive developmental assessment can clarify the child's strengths and weaknesses and identify areas that would benefit from additional stimulation or early intervention.
The pediatrician should emphasize that a single developmental assessment does not determine a child's diagnosis or long-term prognosis; rather, it provides more information about the child's current status, which the pediatrician and parents can use in deciding about appropriate intervention. Most developmental assessments are conducted best by a developmental or child psychologist, a "developmental" pediatrician, or other professional person who has received special training in the evaluation of young children.

DEVELOPMENTAL TESTS

The most psychometrically sound instruments for developmental testing are the Bayley Scales of Infant Development and its recent revision, the Bayley II.¹⁰ The Bayley was designed for children ages 2 to 30 months and was standardized on 1262 infants stratified by age, geographic area, gender, race, and education of parent. It takes approximately 45 to 60 minutes to administer and requires a high level of training. The Bayley II extends the age range covered to 1 to 42 months and was restandardized on 1700 children who were stratified according to age, gender, race/ethnicity, geographic region, and parent education, based on 1988 US census data. New items were added, and some items were modified or deleted.

Both versions of the Bayley are well-standardized measures that consist of a Mental Scale, a Psychomotor Scale, and an Infant Behavior Record (referred to as the Behavior Rating Scale on the Bayley II). Standardized scores that have a mean of 100 and a standard deviation of 16 (15 for the Bayley II) are produced for the Mental and Psychomotor Scales. The Infant Behavior Record of the original Bayley describes the child's behavioral style during the evaluation, including social orientation, cooperation, fearfulness, tension, emotional tone, object orientation, goal directedness, attention span, endurance, activity, and reactivity. This scale was revised completely in the Bayley II to provide a systematic 5-point scoring system for all items, with the goal of facilitating scoring and interpretation and improving reliability.

An additional feature of the Bayley II is the inclusion of four facets (Cognitive, Language, Personal/Social, and Motor Quality) designed to help interpret performance on the Mental and Motor Scales. Items on both scales are grouped into one of the facets (based on item-facet correlations), with a developmental age calculated for each facet. This allows one to look at the profile of strengths and weaknesses within the Mental and Motor Scales.

The Bayley has very good reliability, but limited predictive validity until a child is 24 to 30 months of age. Among children younger than 24 months, development primarily is sensorimotor-based, whereas after 2 years of age, language becomes very important for predicting future outcome. The Bayley does not correlate highly with later measures of intelligence because those measures emphasize language and abstract reasoning. Accordingly, the Bayley (or any developmental measure) is used best as a measure of current developmental functioning, rather than as a predictor of later functioning. To monitor the developmental progress of a child who is delayed or at increased risk of being delayed, the child should be assessed repeatedly over time. This is especially true for children who receive intervention, so that the effects of the intervention can be assessed over time.

The Gesell Developmental Schedules was one of the earliest standardized measures of infant development and was restandardized in 1980 as the Revised Gesell and Amatruda Developmental and Neurologic Examination.¹¹ This test is administered in about 30 minutes and is applicable to infants from 1 week to 42 months of age. Items are categorized into Adaptive, Gross Motor, Fine Motor, Language, and Personal-Social skills. A developmental age is calculated for each category, and a developmental quotient is calculated by dividing the developmental age achieved by the chronological age and multiplying by 100. The Gesell provides useful clinical information; however, the standardization is limited because there are few standardized instructions and no standardized scores, it is based on small normative samples, and there are few data regarding reliability and validity.

Psychological Testing

Psychological tests such as intelligence tests, achievement tests, or behavior rating scales should be used for preschool to school-age children who have developmental disabilities, learning difficulties, or behavioral or emotional problems. A comprehensive assessment battery administered by a child psychologist generally is most useful for diagnosing problems in this age range because several factors often interact to produce learning or behavior difficulties. The child's or adolescent's intellectual abilities, academic skills, and behavioral functioning, as well as the family context, contribute to any difficulty that the pediatrician may be asked to address. Typical problems in this age range about which parents ask pediatricians include learning disabilities, school failure, hyperactivity or attention deficit disorders, and behavior and emotional disturbances.
Any assessment battery should include a measure of intellectual functioning to indicate overall cognitive ability. There are several tests of intelligence; the most widely used ones probably are the Wechsler scales. These include the Wechsler Preschool and Primary Scale of Intelligence-Revised (WPPSI-R) for children 3 years to 7 years, 3 months old; the Wechsler Intelligence Scale for Children III (WISC III) for those 6 to 16 years; and the Wechsler Adult Intelligence Scale-Revised (WAIS-R) for those 16 years to adulthood.¹²⁻¹⁴ All of these measures are well-standardized and have well-established reliability and validity (primarily for predicting school performance). Each test consists of a Verbal Scale and a Performance (or nonverbal) Scale, with subtests within each domain that measure a variety of skills. Each subtest provides a scaled score (mean = 10, standard deviation = 3); these are combined to produce a Verbal Intelligence Quotient (VIQ), a Performance IQ (PIQ), and a Full Scale IQ (FSIQ).

The IQ scores are standardized based on comparisons with scores earned by same-age peers in the standardization group. They are based on a mean of 100 and a standard deviation of 15 and allow comparisons across ages. Thus, an average IQ score at any age is 100, with the range of 85 to 115 falling within 1 standard deviation of the mean. The Table indicates the range of IQs and the diagnostic category assigned to each.

TABLE. Classification of IQ Ranges

IQ RANGE         CLASSIFICATION
130 and above    Very superior
120-129          Superior
110-119          High average
90-109           Average
80-89            Low average
70-79            Borderline
69 and below     Mentally deficient

Adapted from Wechsler (1991).

The WISC III often is used in an assessment battery to assess a child for learning problems. The profile of subtest scores in the Verbal and Performance areas and the discrepancy between VIQ and PIQ often are more useful in determining the nature of cognitive difficulties than is the FSIQ. For example, a child who achieves an average FSIQ of 100, but who has a significant discrepancy between VIQ and PIQ (eg, 90 versus 110), can be described as having average intelligence but also a significant difference between his or her verbal and nonverbal processing abilities. Analysis of the child's profile of scores within the Verbal and Performance areas will identify areas of strength and weakness and suggest areas in need of remediation.

The Verbal Scale consists of six subtests (Information, Similarities, Arithmetic, Vocabulary, Comprehension, Digit Span) that depend on the child's receptive and expressive language skills and reflect accumulated knowledge and experience. A child's performance on several of these subtests, including Information, Arithmetic, and Vocabulary, is influenced by school experience. Other subtests, such as Comprehension and Similarities, require knowledge regarding social norms and comprehension of abstract verbal concepts, respectively. These tasks are likely to be influenced by social stimulation or deprivation. Attention and concentration also are required for successful completion of the Digit Span and Arithmetic subtests, which reflect short-term memory skills.

The Performance Scale provides a measure of nonverbal cognitive abilities; little or no verbal response is required by the child. The Scale consists of seven subtests (Picture Completion, Coding, Picture Arrangement, Block Design, Object Assembly, Symbol Search, and Mazes), all of which assess nonverbal reasoning and problem-solving.
Other specific skills assessed include visual-spatial perception and organization (eg, Picture Completion, Block Design, Object Assembly, Mazes, Symbol Search); visual sequencing (eg, Picture Arrangement, Coding); visual-motor coordination (eg, Coding, Block Design, Object Assembly, Mazes); attention, concentration, and short-term memory (eg, Coding, Picture Completion); and psychomotor speed of processing nonverbal information (eg, Block Design, Object Assembly, Coding, Symbol Search, Mazes).

Different profile patterns suggest different types of cognitive impairments. For example, mentally retarded children generally will have depressed scores across all subtests, with no significant discrepancy between the Verbal and Performance domains. Children who have learning disabilities tend to have average or above average FSIQs and may have a significant difference between VIQ and PIQ or significant scatter among the subtests within either scale. Most commonly, these children have lower Verbal scores than Performance scores (consistent with reading, spelling, or arithmetic disabilities), although the reverse pattern can occur among children who have impaired visual-perceptual abilities (ie, lower Performance than Verbal scores).

The subtest scores on the WISC III also can be grouped by index scores, which provide more information about the pattern of cognitive problems. These scores include the Verbal Comprehension Index (composed of Information, Similarities, Vocabulary, and Comprehension scores), the Perceptual Organization Index (composed of Picture Completion, Picture Arrangement, Block Design, and Object Assembly scores), the Freedom From Distractibility Index (composed of Arithmetic and Digit Span scores), and the Processing Speed Index (composed of Coding and Symbol Search scores). Significant discrepancies among index scores can highlight strengths and weaknesses and further clarify learning and attention deficit disorders. For example, children who have attention deficit disorder and hyperactivity often have lower scores on the Freedom From Distractibility Index than on the other indices.

Intelligence tests alone, however, do not provide enough information for diagnosis or the planning of treatment. There is no one type of WISC III profile that applies to all cases of learning disabilities. The discrepancy between ability and achievement probably is the best indicator of learning disability. Therefore, measures of both intellectual functioning and academic achievement are needed to determine whether the child's intellectual potential is consistent with his or her school performance. For example, a child who has average or above average intellectual abilities but scores below average on tests of achievement is not able to learn information and perform at the level of his or her intellectual potential. The child's scores on the achievement and intelligence tests can be compared to determine if this is due to a specific learning disability (eg, in reading, spelling, or math) or to a more pervasive problem of selective attention and concentration.

ACHIEVEMENT TESTS

Two of the most common achievement tests used are the Wide Range Achievement Test-3 (WRAT-3) and the Woodcock-Johnson Psycho-Educational Battery-Revised (15,16). The WRAT-3 is a brief, easily administered test of reading, spelling, and arithmetic that was restandardized in 1993 and has high test-retest reliability and satisfactory validity. It provides standardized scores (mean = 100, standard deviation = 15), percentile ranks, and grade equivalents. Scores falling within 1 standard deviation of the mean, or between the 16th and 84th percentile rank, are considered to be within the normal range. The percentile rank achieved indicates the level at which a child performed compared with other children in the normative group. Thus, if a child scored at the 55th percentile, he or she performed as well as or better than 55% of the normative group, but not as well as 45% of the normative group. In terms of grade equivalents, a grade equivalent score within 1 year of actual grade level is considered average, while a grade equivalent 2 or more years below actual grade level is considered delayed. Because the WRAT-3 is a brief measure, it is used best for academic screening, but should not be used alone for planning remediation.

The Woodcock-Johnson is a comprehensive assessment measure that consists of 27 subtests in three categories: Tests of Cognitive Ability, Tests of Achievement, and Tests of Interest Level. The battery is designed for those aged 2 years to adulthood. The Tests of Achievement are the most useful component of the battery for assessing school achievement and learning difficulties because their reliability and validity are satisfactory. The Tests of Cognitive Ability are reported to have inadequate construct validity and should not be substituted for other standardized measures of intelligence. The achievement tests require 30 to 40 minutes to administer, and although they take longer to administer than the WRAT-R, the results are more useful for planning academic remediation. Standard scores (mean = 100, standard deviation = 15), percentile ranks, and grade norms are provided for reading, mathematics, written language, knowledge, and skills. As with other measures, scores within 1 standard deviation of the mean are considered to be within the range of normal, as are scores falling between the 16th and 84th percentile ranks.

BEHAVIOR SCALES

In addition to measures of intelligence and achievement, behavior rating scales are important in an assessment battery. These scales can indicate the child's behavioral functioning in the home and school and help interpret the intellectual and achievement measures. Children whose intelligence is average or above and whose school performance is poor may have an attention deficit or an emotional problem that is interfering with school performance. Behavior rating scales can help identify these difficulties.

The two scales used most commonly are the Child Behavior Checklist (CBCL) and the Conners Rating Scale (17,18). The CBCL includes parent and teacher forms for children ages 4 to 16 years, a parent form for children ages 2 to 3 years, a youth self-report form for those ages 11 to 18 years, and a direct observation form. Items are scored 2, 1, or 0 according to whether they are very true, sometimes true, or not true, and they are plotted on a profile that consists of an Internalizing Scale and an Externalizing Scale, based on factor analysis. The raw scores can be converted to T-scores, based on age and gender. T-scores are standardized scores that have a mean of 50 and a standard deviation of 10. T-scores above 70 are considered clinically significant. Reliability and validity data are satisfactory, although ratings are based on parent, teacher, or self-perceptions.

The Conners Parent Rating Scale was designed for rating the behavior of children 3 to 17 years of age. It is available in either a long (93 items) or short form (48 items). In addition, an abbreviated 10-item version, referred to as the Hyperactivity Index, is available for screening and follow-up.
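The scoring conventions described above (standard scores with a mean of 100 and a standard deviation of 15, and T-scores with a mean of 50 and a standard deviation of 10) can be related to percentile ranks with a short calculation. The Python sketch below is illustrative only. It assumes that scores are normally distributed in the normative group, which is an assumption made here and not a statement from any test manual, and the function names are hypothetical rather than part of any published scoring software.

```python
from statistics import NormalDist

# Assumption: scores in the normative group follow a normal distribution.
STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def percentile_rank(score, mean=100.0, sd=15.0):
    """Approximate percentile rank of a standardized score.

    Defaults match the standard-score scale (mean 100, SD 15);
    pass mean=50, sd=10 for a T-score.
    """
    z = (score - mean) / sd
    return 100.0 * STANDARD_NORMAL.cdf(z)


def within_normal_range(score, mean=100.0, sd=15.0):
    """True if the score lies within 1 standard deviation of the mean."""
    return abs(score - mean) <= sd


def t_score_clinically_significant(t_score, cutoff=70.0):
    """True if a T-score exceeds the cutoff of 70 cited in the text."""
    return t_score > cutoff


if __name__ == "__main__":
    for s in (85, 100, 115):
        print(s, round(percentile_rank(s)), within_normal_range(s))
    print(round(percentile_rank(70, mean=50, sd=10), 1),
          t_score_clinically_significant(72))
```

Under the normality assumption, standard scores of 85 and 115 fall at about the 16th and 84th percentile ranks, which matches the normal range quoted above, and a T-score of 70 lies 2 standard deviations above the mean, at roughly the 98th percentile.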
A teacher rating form, the Conners Teacher Rating Scale, also is available in long (39 items), short (28 items), and abbreviated (10 items) versions for children 4 to 12 years. Items on both the parent and teacher versions are rated 0, 1, 2, or 3 if they occur not at all, just a little, pretty much, or very much. Items are grouped into factors (eg, conduct problem, learning problem, psychosomatic problem, hyperactivity, anxiety). Raw scores can be converted into T-scores, with a mean of 50 and a standard deviation of 10. Age-by-gender normative data are available, with T-scores above 70 considered indicative of behavior problems in each factor domain. Reliability and validity data are satisfactory for the factors measured, although the hyperactivity scale is the component of this scale used most commonly.

Rating scales can be particularly useful in the assessment of attention deficit hyperactivity disorder because they can reflect behavior in multiple settings (eg, home, school, after-school care), as well as provide observations by multiple sources (eg, parent, teacher, child care provider). In addition, they provide a useful way to monitor a child's response to treatment. For example, rating scales that are completed before, during, and after treatment can document changes in behavior as a function of the treatment provided.

The major limitation of behavior rating scales is that they are based on perceptions of the rater and so are not totally objective. If the rater (ie, the parent or teacher) is not considered to be reliable, the results cannot be assumed to be accurate. This limitation can be controlled by obtaining multiple ratings by different raters in different settings. Thus, for example, if ratings by a teacher and a child care provider indicate no hyperactivity, but ratings by a parent endorse hyperactivity, the behavior may be considered situation-specific (ie, at home with parent) rather than an enduring characteristic of the child.
This, in turn, will influence the type of treatment recommended.

FOLLOW-UP ASSESSMENTS

Once a comprehensive assessment battery is completed, a diagnosis is formulated, and a treatment plan is designed, follow-up evaluations should be planned. Children who have developmental disabilities, learning disabilities, or behavioral problems should be assessed repeatedly to determine the progress they have made as a result of intervention or as a function of developmental maturation. Results from systematic evaluations can be used to update or revise treatment and to determine the needs for the child's future (eg, special education, social skills training, vocational training).

Summary

Pediatricians play a central role in monitoring the development of infants and children during the course of providing well child care. Parents turn to pediatricians for help in determining whether their child has a temporary lag in development, a serious delay or disorder, or a significant behavior problem that should be addressed. With the passage of PL 99-457, pediatricians also play a key role in referring children at risk to early intervention services. By employing a strategy of developmental surveillance, with periodic developmental screening, the pediatrician can determine when a child should be referred for more extensive developmental or psychological testing, which will aid in the process of diagnosis and treatment of developmental disabilities and behavioral disturbances. Knowledge of the screening and testing measures used commonly, as well as their limitations, will result in more accurate interpretation of the data derived from such measures. Once delays are diagnosed and treatment is initiated, repeated assessments over time will serve to identify areas in need of continuing intervention while indicating gains made in specific areas of developmental functioning. Throughout this process, the pediatrician's role as advocate for the child and family serves as a bridge to other professionals and services, with the ultimate goal of facilitating the optimal development of the child.

REFERENCES

1. Dworkin P. British and American recommendations for developmental monitoring: the role of surveillance. Pediatrics. 1989;84:1000-1010
2. Frankenburg W, Dodds J. The Denver II: a major revision and restandardization of the Denver Developmental Screening Test. Pediatrics. 1992;89:91-97
3. Ireton H, Thwing E. The Minnesota Child Development Inventory. Minneapolis, Minn: Behavioral Science Systems; 1974
4. Newborg J, Stock J, Wnek L. Battelle Developmental Inventory. Allen, Tex: DLM Teaching Resources; 1984
5. Capute AJ, Palmer FB, Shapiro BK, et al. The Clinical Linguistic and Auditory Milestone Scale: prediction of cognition in infancy. Dev Med Child Neurol. 1986;28:762
6. Coplan J. ELM Scale: The Early Language Milestone Scale. Tulsa, OK: Education Corporation; 1983
7. Frankenburg W, Coons C. Home screening questionnaire: its validity in assessing home environment. J Pediatr. 1986;108:624-626
8. Jellinek M, Murphy J, Robinson J, et al. Pediatric symptom checklist: screening school-age children for psychosocial dysfunction. J Pediatr. 1988;112:201-209
9. Bayley N. Manual for the Bayley Scales of Infant Development. Berkeley, Calif: Psychological Corporation; 1969
10. Bayley N. Bayley Scales of Infant Development. Second Edition. San Antonio, Tex: The Psychological Corporation; 1993
11. Knobloch H, Stevens F, Malone A. Manual of Developmental Diagnosis. New York, NY: Harper & Row; 1980
12. Wechsler D. Manual for the Wechsler Preschool and Primary Scale of Intelligence-Revised. San Antonio, Tex: The Psychological Corporation; 1989
13. Wechsler D. Manual for the Wechsler Intelligence Scale for Children III. San Antonio, Tex: The Psychological Corporation; 1991
14. Wechsler D. Manual for the Wechsler Adult Intelligence Scale-Revised.
San Antonio, Tex: The Psychological Corporation; 1981
15. Wilkinson G. Wide Range Achievement Test-3 Administration Manual. Wilmington, Del: Jastak Associates, Inc; 1993
16. Woodcock RW, Johnson MB. Woodcock-Johnson Psycho-Educational Battery-Revised. Allen, Tex: DLM Teaching Resources; 1989
17. Achenbach T. Manual for the Child Behavior Checklist. Burlington, Vt: University of Vermont; 1991
18. Conners K. Conners Rating Scales. New York, NY: Multi-Health Systems, Inc; 1989

SUGGESTED READING

Committee on Children With Disabilities. Screening for developmental disabilities. Pediatrics. 1986;78:526-528
Frankenburg W, Fandal A, Thornton S. Revision of Denver prescreening developmental questionnaire. J Pediatr. 1987;110:653-657
Gibbs E, Teti D. Interdisciplinary Assessment of Infants. Baltimore, Md: Paul H. Brookes Publishing Co; 1990
Glascoe FP, Byrne KE, Ashford LG, et al. Accuracy of the Denver II in developmental screening. Pediatrics. 1992;89:1221-1225
Meisels S. Can developmental screening tests identify children who are developmentally at risk? Pediatrics. 1989;83:578-585
Sattler J. Assessment of Children's Abilities. San Diego, Calif: Jerome M. Sattler, Publisher; 1988
Strangler SR, Huber CJ, Roth DK. Screening Growth and Development of Preschool Children: A Guide for Test Selection. New York, NY: McGraw-Hill; 1980

PIR QUIZ

9. Among the following screening tests, the one that appears to be best for the evaluation of school readiness is the:
A. Battelle Developmental Inventory Screening Test.
B. Denver Developmental Screening Test (DDST II).
C. Early Screening Inventory (ESI).
D. Minnesota Child Developmental Inventory (MCDI).

10. When a test report indicates that a child has an intelligence quotient or developmental quotient between 85 and 115 and that the child, therefore, is "average," it will be appropriate to conclude that the child:
A. Does not need further testing.
B. Has scored within 1 standard deviation of the average (mean) score for a reference group.
C. Is neither dull nor bright.
D. Is normal.

11. When a developmental test standardized on a group of children drawn from an affluent community is applied to children from a lower middle class or economically disadvantaged group, the test likely will suffer most importantly from impaired:
A. Reliability.
B. Sensitivity.
C. Specificity.
D. Validity.

12. A test that gives widely divergent results on early repetition with a child who appears to be in the same clinical state on both occasions may have low:
A. Reliability.
B. Sensitivity.
C. Specificity.
D. Validity.

13. The ability with which a screening test can identify all affected persons in a clinical sample is a measure of the test's:
A. Reliability.
B. Sensitivity.
C. Specificity.
D. Validity.

14. The ability of a screening test to identify all nonaffected persons in a clinical sample is a measure of its:
A. Reliability.
B. Sensitivity.
C. Specificity.
D. Validity.

15. The ability of a test to measure what it purports to measure is an aspect of its:
A. Reliability.
B. Sensitivity.
C. Specificity.
D. Validity.

IN BRIEF

Tongue-tie: Management of a Short Sublingual Frenulum

The Tongue. Gorlin RJ, Sedano HO. In: Stevenson RE, Hall JG, Goodman RM, eds. Human Malformations and Related Anomalies. Volume II. New York, NY: Oxford University Press; 1993:401-403
Tongue-Tie. Catlin FI, De Haan V. Arch Otolaryngol. 1971;94:548-557
Assessment of Lingual Function When Ankyloglossia (Tongue-tie) Is Suspected. Williams WN, Waldron MM. JADA. 1985;110:353-356
Sublingual Dimensions in Infants and Young Children. Fletcher SG, Daly DA. Arch Otolaryngol. 1974;99:292-296
Neonatal Frenotomy May Be Necessary to Correct Breast Feeding Problems. Marmet C, Shell E, Marmet R. J Human Lact. 1990;6:117-120

Tongue-tie, or ankyloglossia, historically has been believed to cause speech defects, as well as breastfeeding difficulties and dental problems. St. Mark wrote, "The string of his tongue was loosed and he spoke plain," and midwives in the 15th century reportedly kept a fingernail sharp to cut the frenula of all newborns in an attempt to prevent possible speech problems. Only within the last century has it become acceptable not to perform frenulotomy for children who have ankyloglossia.

During early development the tongue is fused to the floor of the mouth. Cell death and resorption free the tongue, with the frenulum left as the only remnant of the initial attachment. Tongue-tie results from a short and thickened lingual frenulum, which restricts (or ties) movements of the tongue. Limitation of movement may vary from very mild to complete fusion of the tongue to the floor of the mouth. Fusion is referred to as complete ankyloglossia. Tongue-tie, really partial ankyloglossia, is defined as a limitation of movement severe enough that notching of the tip of the tongue occurs when an attempt is made to protrude it from the mouth.

The incidence of significant tongue-tie has been estimated to be less than 0.5 per 1000. This still should be frequent enough for complications to have been reported in the literature, but no definitive picture has emerged of partial ankyloglossia as a cause of speech defects, breastfeeding difficulties, or dental problems. In fact, reviews of the literature generally suggest that ankyloglossia is not the significant cause of speech defects it was believed to be historically. Most speech pathologists feel that partial ankyloglossia rarely inter-

Developmental Testing
Kathleen E. Gilbride
Pediatrics in Review 1995;16;338
DOI: 10.1542/pir.16-9-338

The online version of this article, along with updated information and services, is located on the World Wide Web at: http://pedsinreview.aappublications.org/content/16/9/338

Pediatrics in Review is the official journal of the American Academy of Pediatrics. A monthly publication, it has been published continuously since 1979. Pediatrics in Review is owned, published, and trademarked by the American Academy of Pediatrics, 141 Northwest Point Boulevard, Elk Grove Village, Illinois, 60007. Copyright © 1995 by the American Academy of Pediatrics. All rights reserved. Print ISSN: 0191-9601.