References

Akoglu H. User's guide to correlation coefficients. Turk J Emerg Med. 2018; 18:(3)91-93 https://doi.org/10.1016/j.tjem.2018.08.001

Anrys C, Van Tiggelen H, Verhaeghe S, Van Hecke A, Beeckman D. Independent risk factors for pressure ulcer development in a high-risk nursing home population receiving evidence-based pressure ulcer prevention: results from a study in 26 nursing homes in Belgium. Int Wound J. 2019; 16:(2)325-333 https://doi.org/10.1111/iwj.13032

Bannigan K, Watson R. Reliability and validity in a nutshell. J Clin Nurs. 2009; 18:(23)3237-3243 https://doi.org/10.1111/j.1365-2702.2009.02939.x

Charalambous C, Koulori A, Vasilopoulos A, Roupa Z. Evaluation of the validity and reliability of the Waterlow Pressure Ulcer Risk Assessment Scale. Med Arh. 2018; 72:(2)141-144 https://doi.org/10.5455/medarh.2018.72.141-144

The introduction of the Purpose T Pressure Ulcer Risk Assessment Tool in an acute hospital NHS trust. 2015. https://tinyurl.com/y6pkfkfx (accessed 21 October 2019)

Coleman S, Gorecki C, Nelson EA et al. Patient risk factors for pressure ulcer development: systematic review. Int J Nurs Stud. 2013; 50:(7)974-1003 https://doi.org/10.1016/j.ijnurstu.2012.11.019

Coleman S, Nelson EA, Keen J et al. Developing a pressure ulcer risk factor minimum data set and risk assessment framework. J Adv Nurs. 2014; 70:(10)2339-2352 https://doi.org/10.1111/jan.12444

Coleman S, Smith IL, McGinnis E et al. Clinical evaluation of a new pressure ulcer risk assessment instrument, the Pressure Ulcer Risk Primary or Secondary Evaluation Tool (PURPOSE T). J Adv Nurs. 2018; 74:(2)407-424 https://doi.org/10.1111/jan.13444

Crane N, Pool N, Chang I, Rogan S, Stocker C, Raman S. A dedicated paediatric logistic organ dysfunction score - adjusted pressure injury risk assessment scale is required for tertiary paediatric ICUs. Cardiol Young. 2019; 29:(3)455-456 https://doi.org/10.1017/S1047951118002251

De Meyer D, Verhaeghe S, Van Hecke A, Beeckman D. Knowledge of nurses and nursing assistants about pressure ulcer prevention: a survey in 16 Belgian hospitals using the PUKAT 2.0 tool. J Tissue Viability. 2019; 28:(2)59-69 https://doi.org/10.1016/j.jtv.2019.03.002

European Pressure Ulcer Advisory Panel, National Pressure Ulcer Advisory Panel, Pan Pacific Pressure Injury Alliance. Prevention and treatment of pressure ulcers: quick reference guide. 2014. https://tinyurl.com/y9ow6uce (accessed 21 October 2019)

Feinstein AR. Clinimetric perspectives. J Chronic Dis. 1987; 40:(6)635-640 https://doi.org/10.1016/0021-9681(87)90027-0

Ferrante di Ruffano L, Hyde CJ, McCaffery KJ, Bossuyt PM, Deeks JJ. Assessing the value of diagnostic tests: a framework for designing and evaluating trials. BMJ. 2012; 344 https://doi.org/10.1136/bmj.e686

Fitch K, Bernstein SJ, Aguilar MD et al. The RAND/UCLA appropriateness method user's manual. 2001. https://www.rand.org/pubs/monograph_reports/MR1269.html (accessed 23 October 2019)

Fulbrook P, Lawrence P, Miles S. Australian nurses' knowledge of pressure injury prevention and management: a cross-sectional survey. J Wound Ostomy Continence Nurs. 2019; 46:(2)106-112 https://doi.org/10.1097/WON.0000000000000508

Lalkhen AG, McCluskey A. Clinical tests: sensitivity and specificity. BJA Educ. 2008; 8:(6)221-223 https://doi.org/10.1093/bjaceaccp/mkn041

Goodall RJ, Langridge B, Onida S, Davies AH, Shalhoub J. Current status of noninvasive perfusion assessment in individuals with diabetic foot ulceration. J Vasc Surg. 2019; 69:(2)315-317 https://doi.org/10.1016/j.jvs.2018.09.043

Hlavin G, Koenig F, Male C, Posch M, Bauer P. Evidence, eminence and extrapolation. Stat Med. 2016; 35:(13)2117-2132 https://doi.org/10.1002/sim.6865

Kottner J, Balzer K. Do pressure ulcer risk assessment scales improve clinical practice? J Multidiscip Healthc. 2010; 3:103-111 https://doi.org/10.2147/JMDH.S9286

McShane BB, Gal A, Gelman A, Robert C, Tackett JL. Abandon statistical significance. The American Statistician. 2019; 73:235-245 https://doi.org/10.1080/00031305.2018.1527253

Mervis JS, Phillips TJ. Pressure ulcers: pathophysiology, epidemiology, risk factors, and presentation. J Am Acad Dermatol. 2019; 81:(4)881-890 https://doi.org/10.1016/j.jaad.2018.12.069

Moore ZEH, Patton D. Risk assessment tools for the prevention of pressure ulcers. Cochrane Database Syst Rev. 2019; 1 https://doi.org/10.1002/14651858.CD006471.pub4

Mukaka MM. Statistics corner: a guide to appropriate use of correlation coefficient in medical research. Malawi Med J. 2012; 24:(3)69-71

National Institute for Health and Care Excellence. Pressure ulcers: prevention and management. Clinical guideline CG179. 2014. https://www.nice.org.uk/guidance/cg179/resources (accessed 21 October 2019)

Nixon J, Nelson EA, Rutherford C et al. Pressure UlceR Programme Of reSEarch (PURPOSE): using mixed methods (systematic reviews, prospective cohort, case study, consensus and psychometrics) to identify patient and organisational risk, develop a risk assessment tool and patient-reported outcome quality of life and health utility measures. Programme Grants for Applied Research. 2015; 3:(6)1-630 https://doi.org/10.3310/pgfar03060

Parahoo K. Nursing research: principles, process and issues, 3rd ed. Basingstoke: Palgrave Macmillan; 2014

Qaseem A, Mir TP, Starkey M, Denberg TD. Risk assessment and prevention of pressure ulcers: a clinical practice guideline from the American College of Physicians. Ann Intern Med. 2015; 162:(5)359-369 https://doi.org/10.7326/M14-1567

Siedlecki SL, Albert NM. Understanding interrater reliability and validity of risk assessment tools used to predict adverse clinical events. Clin Nurse Spec. 2017; 31:(1)23-29 https://doi.org/10.1097/NUR.0000000000000260

Skivington K, Matthews L, Craig P, Simpson S, Moore L. Developing and evaluating complex interventions: updating Medical Research Council guidance to take account of new methodological and theoretical approaches. Meeting abstracts. Lancet. 2018; 392:(Special Issue) https://doi.org/10.1016/S0140-6736(18)32865-4

Turnbull AE, Dinglas VD, Friedman LA et al. A survey of Delphi panelists after core outcome set development revealed positive feedback and methods to facilitate panel member participation. J Clin Epidemiol. 2018; 102:99-106 https://doi.org/10.1016/j.jclinepi.2018.06.007

A clinimetric analysis of the Pressure Ulcer Risk Primary or Secondary Evaluation Tool: PURPOSE-T

14 November 2019
Volume 28 · Issue 20

Abstract

The assessment of patients' risk for developing pressure ulcers is a routine and fundamental nursing process undertaken to prevent avoidable harm to patients in all care settings. Many risk assessment tools are currently used in clinical practice; however, no individual tool is recommended by advisory bodies such as the National Institute for Health and Care Excellence or the European Pressure Ulcer Advisory Panel. The evidence base on the value of structured risk assessment tools in reducing the incidence or severity of pressure ulcers is poor. The purpose of this article is to provide a clinimetric analysis of the recently developed Pressure Ulcer Risk Primary or Secondary Evaluation Tool (PURPOSE-T) and identify areas for future research to improve the utility of structured risk assessment in identifying patients at risk of developing pressure ulcers.

Pressure ulceration has detrimental impacts on patients both physically and psychologically and is associated with significant economic implications for health services. It is therefore paramount that at-risk patients are identified before significant pressure-related tissue damage occurs in order to effectively implement primary preventive interventions (Mervis and Phillips, 2019). The use of pressure ulcer risk assessment tools (PURAT) in adult patients is highly recommended by the European Pressure Ulcer Advisory Panel (EPUAP et al, 2014), advocated as a ‘consideration’ by the National Institute for Health and Care Excellence (2014), but considered to have no impact on the incidence or severity of pressure ulcers by the Cochrane Collaboration (Moore and Patton, 2019).

The lack of consensus surrounding the value of PURAT indicates a potential lack of evidence for the clinimetric properties of the tools evaluated, specifically the features identified in seminal work by Feinstein (1987): reliability, validity and sensitivity. Notably, the most commonly utilised risk assessment tools (the Waterlow and Braden tools) have been demonstrated to have low sensitivity and specificity in differentiating the levels of risk in patients, potentially limiting their clinical value (Qaseem et al, 2015). This article evaluates the clinimetrics of a PURAT developed by Nixon et al (2015): PURPOSE-T.

Validity

The validity of a tool refers to how effectively it measures the phenomenon it was intended to assess (Charalambous et al, 2018). In the case of the PURPOSE-T, validity depends on its accuracy in identifying at-risk patients. The simplest form of validity is face validity, which refers to how much a tool appears to measure what it is intended to measure (Charalambous et al, 2018). Despite face validity being a poor determinant of the overall validity of a tool, it is associated with greater compliance because users are more likely to accept the tool's utility, and this may lead to better outcomes for patients (Bannigan and Watson, 2009). This issue is particularly pertinent in evaluating PURAT because it has been suggested that clinical judgement is equally if not more effective than the use of tools; this indicates that a tool that stimulates clinical judgement may improve the accuracy of risk assessments, even if it contains flaws in content or criterion validity (Moore and Patton, 2019).

In an evaluation of PURPOSE-T by Coleman et al (2018), its content validity was assessed via field notes recorded by expert nurses; they noted that the tool appeared to include important risk factors and that it prompted skin checks, encouraging more careful skin assessment. A recent review by Anrys et al (2019) on risk factors for pressure ulcer (PU) development supports these observations, noting that daily skin inspection is essential for the timely identification of at-risk patients. Content validity is determined by how effectively the items included in a tool measure the intended outcome. In the case of PURPOSE-T, the content validity would be evidenced by how accurately the included risk factors reflect the true risk of PU development in patient populations (Charalambous et al, 2018). The risks included in the PURPOSE-T tool were drawn from a minimum data set, which was derived from a systematic review of PU risks, combined with consensus judgements from a multiprofessional panel (Coleman et al, 2014). Although the validity of consensus judgements is contentious, the potential for bias was mitigated by Coleman et al (2014) via the use of expertise from a range of specialties, and the risk factors selected for discussion were derived from a systematic review of the literature. Other than participant specialty, there were no clear threats to the validity of decisions made via group consensus (Turnbull et al, 2018). Arguably, the clinical outcomes yielded following implementation of consensus-based tools are the only way to provide dependable evidence of validity (Fitch et al, 2001).

The systematic review used as a basis for the identification of risks used in the PURPOSE-T consensus panel identified issues in the literature surrounding PU risk identification, such as the ‘over interpretation of results from individual studies’ (Coleman et al, 2013). Large numbers of independent risks were identified in the literature; however, three primary risk domains were ultimately identified: mobility, perfusion and skin status (Coleman et al, 2013). The assertion that many studies include too many risk factors, potentially leading to the overprediction of risk, was later supported in an evaluation of the popular Waterlow tool, which was suggested to include too many risk factors (Charalambous et al, 2018). Specifically, the study by Coleman et al (2013) did not find that factors commonly considered to be indicative of risk, such as gender, age and medications, were significant predictors of risk.

Criterion validity is considered to be an important determinant of diagnostic accuracy in PURAT, and the literature evaluating risk assessment tools most often focuses on this form of validity. However, the different concepts of validity cannot always be differentiated, for example convergent or predictive validity (Kottner and Balzer, 2010). At present, there is no gold standard for assessing PU risk (NICE, 2014), making convergent validity evaluations dependent partly on clinical judgement. Confounding this, intervention following risk assessment may potentially impact the evaluations of predictive validity (Coleman et al, 2018). Due to the complexity of the issues, such as risk assessment for PUs, guidance has been published by the Medical Research Council on the evaluation of complex interventions suggesting that, due to the challenges inherent in the measurement of the clinical impact of certain interventions, consideration should be given to non-experimental forms of evidence to guide clinical practice (Skivington et al, 2018).

Currently, experimental evidence of the validity of PURPOSE-T is limited to the review by Coleman et al (2018), and the tool has not yet been evaluated or included in guidance by NICE or Cochrane. Clough (2015) undertook a local evaluation of the implementation of the PURPOSE-T in an NHS trust in the UK, identifying that it was associated with a 37% decrease in category 2 PUs over a 1-year period; it was also well liked by staff. Although this provides some evidence of face validity and potentially good compliance with the tool, the study was limited by the low sample size of 31.

The consistency with which a tool produces true positive assessments is known as its sensitivity, and the consistency with which it produces true negatives is its specificity (Siedlecki and Albert, 2017). Determining the sensitivity and specificity of a tool depends on a definitive clinical outcome to enable determination of whether or not the assessment outcome was correct (Lalkhen and McCluskey, 2008). In the case of PURPOSE-T, this would depend on its ability to assess accurately that no risk is present when it is not, which would be evidenced by a PU not developing, and, conversely, by a PU developing where risk was assessed to exist. However, collecting accurate data on this presents ethical challenges because patients (without PUs) who are assessed as being at risk should be offered primary preventive interventions (NICE, 2014) and may therefore not develop a PU even though the assessment of risk was accurate.
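As an illustration of the definitions above, the following minimal Python sketch computes sensitivity and specificity from the four cells of a two-by-two table (assessed at risk or not at risk, against PU developed or not developed). The function name and the counts are hypothetical and chosen only for the arithmetic; they are not data from any PURPOSE-T study.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from 2x2 table counts.

    tp: assessed at risk, PU developed (true positive)
    fn: assessed not at risk, PU developed (false negative)
    tn: assessed not at risk, no PU developed (true negative)
    fp: assessed at risk, no PU developed (false positive)
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one patient with and one without a PU")
    sensitivity = tp / (tp + fn)  # proportion of patients with a PU who were flagged
    specificity = tn / (tn + fp)  # proportion of patients without a PU who were not flagged
    return sensitivity, specificity


# Hypothetical counts, for illustration only
sens, spec = sensitivity_specificity(tp=18, fn=2, tn=60, fp=20)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

In this sketch, any at-risk patient who receives effective prevention and does not develop a PU is counted in the false positive cell, which lowers the apparent specificity even when the original assessment was correct.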

The EPUAP (2014) has advised that reliance should not be placed solely on PU risk assessment, so it is possible that changes in clinical status may lead to the implementation of preventive interventions without formal reassessment of PU risk. In the event of patients becoming critically unwell, potentially drastic changes in risk for pressure ulceration may occur due to changes in a patient's mobility, perfusion and medical device-related risk. Notably, the risk factors associated with declining clinical status (mobility and perfusion) were associated with the lowest reliability scores in the PURPOSE-T evaluation by Coleman et al (2018). This may affect the accuracy of data collected on the sensitivity or specificity of tools such as PURPOSE-T. Assessments may have been accurate at the time they were completed but, due to changes in patients' clinical status, they may no longer reflect the true risk for PUs.

Ultimately, evidencing the specificity and sensitivity of risk assessment tools remains methodologically challenging and the clinical value of tools such as PURPOSE-T with regard to their predictive value currently depends on long-term outcome data, which is yet to be gathered (Ferrante di Ruffano et al, 2012). Early adopters of the tool, such as Clough (2015), demonstrated that, following implementation of PURPOSE-T there was a significant reduction in PU incidence, which may indicate the high sensitivity of the tool in identifying at-risk patients, allowing effective preventive measures to be implemented. It has recently been argued that traditional indicators of statistical significance (specifically P values) should be abandoned, particularly in complex clinical trials that are difficult to reproduce (McShane et al, 2019). This is due to the frequently erroneous relationship of P values with clinical realities, combined with clinicians' dependence on them in the evaluation of trials (McShane et al, 2019). The issues surrounding statistical evaluations of the predictive value of PURPOSE-T are complicated by an ongoing lack of a gold standard tool for PU risk assessment, making comparisons with other tools limited in their indication of effectiveness (Siedlecki and Albert, 2017).

Coleman et al (2018) compared the convergent validity of the PURPOSE-T against that of the Waterlow and Braden tools. Notably, in comparison with these commonly used tools, medium to strong phi correlation coefficients were demonstrated with regard to determining PU risk, as well as identifying risk factors common to the three tools. The clinical significance of these findings is difficult to assess because many of the factors used in earlier tools were removed in PURPOSE-T following the initial systematic review (Coleman et al, 2014). Additionally, the data analysed using the phi coefficients rely on a normal distribution of data (or risk) (Mukaka, 2012), which may not necessarily be present in the populations that were assessed in the study, who may already be considered ‘at risk’ due to being in an acute care setting. It is also important to note that, in Coleman et al's (2018) study, the assessment data were incomplete, which may have affected the statistical analysis of the correlation between PURPOSE-T and the other two assessment tools. In addition, the correlation coefficients calculated may not necessarily reflect the consistency of the tools to predict risk, but may simply reflect commonalities in the subjective judgement between users of the tool(s), which cannot be demonstrated statistically because there is no gold standard assessment with which to determine the presence of complex issues such as risk (Akoglu, 2018).
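For readers unfamiliar with the phi coefficient, the sketch below shows how it is conventionally calculated for two binary classifications, here 'at risk' versus 'not at risk' according to two tools. It uses the standard formula phi = (ad - bc) / sqrt((a+b)(c+d)(a+c)(b+d)). The function name, the cell labels and the counts are hypothetical illustrations and are not taken from Coleman et al (2018).

```python
import math


def phi_coefficient(a, b, c, d):
    """Phi coefficient for a 2x2 table of two binary classifications.

    a: both tools classify the patient as at risk
    b: only the first tool classifies the patient as at risk
    c: only the second tool classifies the patient as at risk
    d: both tools classify the patient as not at risk
    """
    denominator = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denominator == 0:
        raise ValueError("phi is undefined when a row or column total is zero")
    return (a * d - b * c) / denominator


# Hypothetical counts for 100 patients, for illustration only
phi = phi_coefficient(a=40, b=10, c=5, d=45)
print(f"phi = {phi:.2f}")
```

A value near 1 indicates that the two tools classify the same patients as at risk; as the text notes, this shows agreement between tools rather than agreement with the true outcome.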

Children, psychiatric and critically unwell patients were excluded from the study by Coleman et al (2014), although these groups cumulatively represent a large proportion of the population and are considered to be particularly vulnerable to developing PUs (Crane et al, 2019). Based on the lack of testing, it could be argued that the validity of PURPOSE-T in the excluded patient groups is impossible to determine. However, ethical challenges surrounding medical research in these patient groups due to their vulnerable status can prevent effective experimental studies being completed to determine efficacy of tools such as PURPOSE-T (Hlavin et al, 2016). This lends further credence to recent assertions that experimental evidence is perhaps limited in its contribution to the evidence surrounding complex concepts such as PURAT (Skivington et al, 2018).

Reliability

The reliability of a risk assessment tool refers to the consistency with which different users obtain the same outcome using the same tool. Inter-rater reliability scores rely on comparison of multiple individuals using the same tool on the same patient at the same time, yielding data that help to determine reliability (Siedlecki and Albert, 2017).

Coleman et al (2018) demonstrated good inter-rater reliability in 230 three-way paired assessments that involved ward, community and expert nurses. Overall, the data gained by Coleman et al (2018) appeared to provide compelling evidence of the reliability of PURPOSE-T. However, subcategories within the assessment showed poor reliability, specifically perfusion status (65.4% agreement), sensory perception (79.1%) and mobility assessment (59.2%). Although the overall reliability score of the tool was high, these three categories fall below the desired 80% agreement, which indicates inadequacies either in the training of the users or of the tool itself (Siedlecki and Albert, 2017).
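The percentage agreement figures above can be understood with a simple calculation. The following sketch assumes raw percentage agreement between two raters on a single item and compares it with the 80% threshold cited in the text; it is not a reconstruction of the statistical method used by Coleman et al (2018), and the rater names and ratings are hypothetical.

```python
def percent_agreement(ratings_a, ratings_b):
    """Percentage of paired assessments on which two raters gave the same rating."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must rate the same non-empty set of patients")
    matches = sum(x == y for x, y in zip(ratings_a, ratings_b))
    return 100.0 * matches / len(ratings_a)


# Hypothetical ratings of one item (for example, a perfusion problem: yes/no)
# for the same 10 patients; illustration only
ward_nurse = ["yes", "no", "yes", "yes", "no", "no", "yes", "no", "yes", "no"]
expert_nurse = ["yes", "no", "no", "yes", "no", "yes", "yes", "no", "no", "no"]

agreement = percent_agreement(ward_nurse, expert_nurse)
meets_threshold = agreement >= 80.0  # 80% is the desired level cited in the text
print(f"agreement = {agreement:.1f}%, meets 80% threshold: {meets_threshold}")
```

Raw percentage agreement does not correct for agreement expected by chance, which is one reason chance-corrected statistics are often reported alongside it.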

Nurses' limited knowledge of PUs has created difficulties internationally in the provision of PU care. In addition, poor knowledge has been associated with attitudes focused on treatment rather than prevention, poor understanding of risk factors and reduced multidisciplinary input (De Meyer et al, 2019; Fulbrook et al, 2019). This lack of knowledge may explain the poor inter-rater reliability scores in the identified three domains, which may arguably require greater knowledge to assess correctly. An alternative explanation for the reliability issues is a lack of effective methods to assess the domains that have poor agreement. Specifically, mobility assessment may be hindered by poor communication with patients (Coleman et al, 2018).

Current NICE (2014) guidance on PU risk assessment recommends the assessment of perfusion via observation of changes in skin colour and the presence of non-blanching erythema, which are known to be subject to observer bias (Parahoo, 2014). Arguably, these subjective assessment methods may be effective only once an injury exists and are of little value in assessing risk. The poor reliability of these aspects of PURPOSE-T may reflect the generally poor clinimetric properties of the methods used to assess these specific issues (perfusion, mobility and sensory perception). Perfusion, in particular, is notoriously difficult to assess using non-invasive techniques (Goodall et al, 2019) in patients with poor communication (Coleman et al, 2018); it is also difficult to assess in situations where there is a lack of multidisciplinary input and therefore no alternative or expert input to aid the assessment of these more challenging risk factors (Fulbrook et al, 2019).

These issues may be confounded by the prevalent perception that PU management is strictly the domain of nurses, who have demonstrably poor knowledge of PU management (De Meyer et al, 2019). This lack of knowledge may prevent the timely involvement of the multidisciplinary team and, ultimately, the provision of effective patient care (De Meyer et al, 2019). The reliability of PURPOSE-T may be improved by the use of assessment methods that are less subjective and by improving the clarity of the language used in some of the sections, such as the mobility assessment; this would help to produce more accurate and reliable results.

Conclusion

Risk assessment tools for PU prevention remain dependent on the clinical judgement of the staff using the tools and a gold standard is yet to be developed (Moore and Patton, 2019). A large-scale evaluation of PURPOSE-T determined that it has good face validity as reported by experts and non-experts (Coleman et al, 2018). This correlates with the findings of the earlier study by Clough (2015), who reported that the tool was well accepted by staff based on its usability and content. The content validity of the tool was determined by a combination of data from a systematic review and the conclusions of a consensus meeting (Coleman et al, 2014). This controversially eliminated many risk factors previously considered to be important in determining the risk for PUs (Coleman et al, 2014), ultimately creating challenges in assessing convergent validity due to the dissimilarities between PURPOSE-T and older tools. This was confounded by statistical assumptions surrounding the distribution of risk factors within the study (Mukaka, 2012).

The inter-rater reliability of PURPOSE-T was demonstrably poor on three subdomains: perfusion status, mobility and sensory perception (Coleman et al, 2018). It is likely that this reflects clinimetric issues with the methods used to assess these specific risks and may be confounded by users' limited knowledge (Siedlecki and Albert, 2017). Overall, the tool appears to produce reliable and consistent assessment outcomes between experts and non-experts (Coleman et al, 2018). Determining the sensitivity and specificity of PURPOSE-T remains difficult, due to the ethical considerations inherent in obtaining definitive outcomes with which to compare initial risk assessments (Lalkhen and McCluskey, 2008). Further research is needed to provide robust evidence on the clinimetric values of PURPOSE-T, including studies with patient populations in which it has not been tested, for example, children, critically ill individuals and psychiatric patients.

KEY POINTS

  • Systematic pressure ulcer risk assessments are currently recommended by the National Institute for Health and Care Excellence and the European Pressure Ulcer Advisory Panel. However, the evidence base supporting their use remains poor
  • Risk assessment tools used in clinical practice have been demonstrated to have low sensitivity and specificity
  • The Pressure Ulcer Risk Primary or Secondary Evaluation Tool (PURPOSE-T) has been validated for use in adult populations, but not in critically unwell, paediatric or psychiatric populations
  • The inter-rater reliability of PURPOSE-T was demonstrably good in most domains, but not for assessing perfusion status, mobility or sensory perception, although this is likely to reflect clinimetric issues with the methods used to assess these risk domains

CPD reflective questions

  • What are the key challenges associated with accurately assessing patient risk for pressure ulcers in clinical practice?
  • What are the implications for patients and health services of ineffective preventive action following the identification of pressure ulcer risk?
  • How could you improve the abilities of members of your team to identify risk for pressure ulcers in your clinical area?