Access to Microdata: The Australian Bureau of Statistics Approach
Teresa Dickinson
[email protected]

This talk...
- Legislation and policy
- Access modes
- Confidentialised unit record files (CURFs)
- Other
- Overseas access to ABS microdata

[Figure: access options placed by level of protection (High to Low) against level of detail (Low to High). Labels: Outside Census and Statistics Act; ABS Outputs; ABS analysis/Consultancy; Regulation 7A - Assist Performance of Statistical functions; Section 16A - Assist Statistician in carrying out functions; ABS On-site Lab; Remote access; CD-ROM access; Specialised tables; Published tables]

Australian Legislation
A number of legislative provisions, either directly or indirectly, can facilitate access to microdata

- Our legislation allows release of microdata, but only in a manner that is not likely to enable the identification of the particular person or organisation to which it relates
- We can release information about businesses (not individuals) 'to assist the Statistician perform statistical functions' - involves collaborations to support the ABS work program
- We can second certain individuals to the ABS to 'assist the Statistician perform statistical functions'

Why provide deeper access to microdata? The Benefits
- Valuable (and high quality) data is under-utilised.
- Researchers may try to collect substitute data sets in order to obtain microdata, which is a waste of public resources (to obtain what is probably lower quality data).
- Government agencies may look to use alternative data providers to obtain survey data for research and analysis purposes, resulting in lower quality data (which may not be as widely accessible)

Risks of providing access
- Misuse - deliberate and inadvertent
- May lead respondents to believe that researchers have the potential to identify their data, and possibly even use it against them
- Loss of trust in processes and work of national statistical offices, leading to reduced response rates

A shift in emphasis...
- From risk avoidance to risk management
- Production of microdata files from household collections is now routine; well developed policies and processes exist
- Beginning to explore ways of making business microdata more accessible, given that it is rare to be able to produce a confidentialised file
- Communication with respondents?
- Engaging with requests for overseas access on a case-by-case basis

Policy response - where ABS is heading
- Four layers of protection
  - Protection in the data
  - Access method
  - User education / partnership
  - Audit and sanctions
- Increased variety of access channels
  - CD-ROM, Remote Access Datalab, ABS Datalab, collaborations
  - different combinations but giving the required protection

Policy - who gets access, and how
- Researchers - government or academic - with a particular statistical purpose
- Undertakings - legally enforceable within Australia
  - won't attempt to identify or match
  - won't share access etc.
  - will abide by rules in a manual
- Undertakings made by the institution and individuals who will work with the data
- Organisational level undertakings approved by a Deputy Australian Statistician

Pricing
- Australian Government agencies must charge for some information products according to a set of guidelines
- There is recovery of the marginal costs for development and dissemination of CURFs
- Access to a microdata file is $A1,200 (+10% GST for Australian users)
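As a rough illustration of the pricing line above, the sketch below computes what a user would pay for access to one microdata file. It assumes, as the slide implies, that the 10% GST is added on top of the $A1,200 for Australian users only. The function and constant names are hypothetical and are not part of any ABS system.

```python
from decimal import Decimal

# Figures quoted in the talk; the names are illustrative assumptions.
CURF_PRICE_AUD = Decimal("1200")  # price of access to one microdata file
GST_RATE = Decimal("0.10")        # 10% GST, assumed to apply to Australian users only


def curf_access_price(is_australian_user: bool) -> Decimal:
    """Return the price in $A of access to one microdata file."""
    if is_australian_user:
        # GST is added on top of the base price.
        return CURF_PRICE_AUD * (1 + GST_RATE)
    return CURF_PRICE_AUD
```

Under these assumptions an Australian user pays $A1,320 per file and an overseas user pays $A1,200. `Decimal` is used instead of floating point so that the currency arithmetic stays exact.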

Policy - creation of files
- Subject area creates files using a set of rules devised by the methodology area (e.g. standard categories for some variables)
- Methodologists vet the files, making changes as necessary to 'ensure' confidentiality, and 'declare' that the risks of spontaneous identification are acceptably low
- The Australian Statistician gives in-principle approval for release of the microdata file

What the client sees...
- One stop shop - all the information about how to access microdata is on our website
- One client contact point - the CURF Management Unit (CMU). Clients submit undertakings through this channel, and the CMU provides access once approved
- Internally, however, lots of areas are involved
  - CMU
  - Subject areas
  - Methodology (assurance of confidentiality and auditing of output)
  - Policy area

ABS CURFs
- Basic: Less detailed data available for analysis
- Expanded: Generally more detailed data available for analysis
- Specialist: May provide high level of detail for analysis. May include data for collections where previously CURFs could not be produced. May allow for integration with other datasets in a way that does not identify individuals

[Table: availability of each CURF type (Basic, Expanded, Specialist) by access mode: CD-ROM, Remote Access Data Lab (RADL), ABS On-site data lab (ABSDL)]

Which CURFs?
CURFs are available from a range of ABS surveys (68 in total):

- Aboriginal and Torres Strait Islander Social Survey
- Aspects of Literacy
- Australians' Employment and Unemployment Patterns
- Business Longitudinal Survey
- Census of Population & Housing
- Child Care Survey
- Disability, Ageing and Carers Survey
- General Social Survey
- Household Expenditure Survey
- Income and Housing Costs Survey
- Labour Mobility Survey
- National Health Survey
- Mental Health and Wellbeing of Adults Survey
- Time Use Survey
- Women's Safety Survey

How Researchers use CURFs
- University Sector
  - Ph.D. Students - increasing use
  - Undergraduate Students - increasing use with the remote access system. Lecturers set course work, as students can access the CURF online with their individual passwords, with less security risk than on CD-ROM
- Government Departments use CURFs as a basis to understand the population to develop public policy
- Recent increase in Government Departments using consultants to do CURF analysis for their purposes.
- Commercial Research Centres use CURFs to develop models for policy analysis.

Examples of work arising from CURFs
- Ellis, R.P. and Savage, E. (2004) Where do you run after you run for cover? A model of the demand for private health insurance in Australia, Australian Health Economics Conference, Melbourne, November 2004.
- Cumpston, J. (2004) Models of the Future of Australia, 2004 Australian Population Association Conference.
- Kok-Wee Ong, The Effect of Literacy on Earnings in Australia, UNSW School of Economics Honours Thesis
- Richardson, S. Society's Investment in Children, National Institute of Labour Studies working paper WP151, Flinders University.

Remote Access Data Laboratory (RADL)
- A remote system that allows users to undertake analyses in SAS, SPSS, or STATA on ABS CURFs
- Instead of a CD-ROM, users get a username and password
- There are various rules about printing records and detailed tables - but looking at a few records is permitted
- Output is (electronically) audited. 94% of jobs are returned within 2 minutes
  - Remaining jobs are manually audited and most are returned within 1 day

- A random sample of all jobs is audited

Audit
- Audit is critical to monitor user behaviour
- All code and output stored
- Cumulative file of all unit data viewed
- All jobs have a chance of being inspected

Emerging issues
- Clients require more functionality, e.g. output format to spreadsheet not text
- Ideally clients would like an interactive system
- Clients want more detailed data
- Clients want more business data
- Clients want longitudinal data
- Clients continue to be price sensitive

ABS On-site data lab (ABSDL)
- Secure room and desktop
- Locked down computer
- Automatic logging of client activity
- No data transmitting devices
- No data or output to enter or leave the room with the client.

ABSDL (cont.)
- Specialist or interactive access to Expanded CURFs
- More detailed and/or sensitive data
- Potential future economic survey data
- Interactive system
- SAS, SPSS, STATA, Excel
- All 8 State & Territory ABS Offices, on demand basis

Collaborations
- A way to broaden the ABS work program by bringing in expertise to 'assist the Statistician with statistical functions'
- A way of providing access, for selected partners, to business microdata that can't be produced as a CURF
- Designed to be of use to both ABS and researcher
- Access is akin to on-site data lab, but data may be close to recognisable (e.g. simply identifiers removed)
- Still working out processes etc., but they are proving time consuming (and therefore expensive) to establish and run
- Will never be in the position of undertaking a large number of collaborations

Overseas Access - ABS data to other organisations
- Have a policy
- Undertakings not legally valid overseas - but we can apply sanctions
- Access on a project-by-project basis under these conditions
  - project is of genuine benefit to Australian policy making
  - organisation is known to us and trusted
  - access is through RADL (almost always)
- Processes to apply, pricing etc. are identical to Australian access

Overseas access - international data repositories (e.g. LIS)
- Challenging!
- Requires establishment of a genuinely collaborative relationship
- Processes etc. worked out on a case-by-case basis, but are congruent with our overall policies
- Detail of data to be released (must) be less than our CURFs
