Investigating Privacy Complaints
Jennifer Felder1, Jennifer King2, Nick Doty2, Prof. Deirdre Mulligan2
1North Carolina State University, 2University of California, Berkeley School of Information
With recent advances in technology comes an increase in the
quantity of information available in the public domain, which
raises concerns regarding the individual's right to privacy. Our
team is interested in understanding the public's concerns about
information privacy in general. To study this issue, we looked for
publicly available data. After exploring
several sources, we chose Yahoo! Answers as an initial source
of privacy complaint data because it provided a useful,
free API and a vast amount of publicly available data,
which avoids the violations of personal privacy
that could otherwise arise. To collect this data, we wrote a Python script
that runs as a command-line tool, queries Yahoo!
Answers for specified keywords, and stores selected attributes of
the returned questions in a MySQL database. My focus in this team was on
adding command-line flags, including additional parameters in
the Yahoo! Answers URL, and creating a cron job to
automatically run the script.
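A minimal sketch of how command-line flags might map onto URL parameters, assuming Python's standard argparse and urllib modules. The endpoint is a placeholder, and the flag names, the `query` parameter name, and the default `sort` value are illustrative assumptions; only the `start` and `sort` parameters are named in this work.

```python
import argparse
from urllib.parse import urlencode

# Placeholder endpoint (assumption): the real service URL is not given here.
BASE_URL = "https://example.invalid/questionSearch"


def parse_args(argv=None):
    """Parse command-line flags (flag names are illustrative assumptions)."""
    parser = argparse.ArgumentParser(
        description="Query a question-and-answer service for a keyword.")
    parser.add_argument("--keyword", required=True,
                        help="search term, e.g. 'privacy'")
    parser.add_argument("--start", type=int, default=0,
                        help="offset of the first result to request")
    parser.add_argument("--sort", default="relevance",
                        help="result ordering to request (value is illustrative)")
    return parser.parse_args(argv)


def build_url(keyword, start, sort):
    """Append the keyword, start, and sort parameters to the base URL."""
    params = {"query": keyword, "start": start, "sort": sort}
    return BASE_URL + "?" + urlencode(params)


# Example: parse an explicit argument list instead of the real command line.
args = parse_args(["--keyword", "privacy", "--start", "50"])
print(build_url(args.keyword, args.start, args.sort))
```

Passing `argv=None` makes `parse_args` read the real command line, so the same function serves both interactive use and a scheduled run.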
The flowchart below illustrates the design of the overall script.
The process includes connecting to and querying Yahoo!
Answers for a specified keyword and storing the results in a
database. My focus is highlighted in purple.
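The storage step could look roughly like the sketch below. It uses Python's built-in sqlite3 module as a stand-in for the MySQL database so that the example is self-contained. The table name, the column names, and the choice to skip questions that are already stored are assumptions made for illustration, not details taken from the script.

```python
import sqlite3


def open_database(path=":memory:"):
    """Open the database and create the questions table if it is missing."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS questions (
               question_id TEXT PRIMARY KEY,
               keyword     TEXT NOT NULL,
               subject     TEXT,
               content     TEXT
           )""")
    return conn


def store_questions(conn, keyword, questions):
    """Insert questions, ignoring ids already stored; return rows added."""
    before = conn.total_changes
    conn.executemany(
        "INSERT OR IGNORE INTO questions VALUES (?, ?, ?, ?)",
        [(q["id"], keyword, q["subject"], q["content"]) for q in questions])
    conn.commit()
    return conn.total_changes - before


# Example with made-up records (hypothetical data, not collected results).
conn = open_database()
sample = [{"id": "q1", "subject": "Is my email private?", "content": "..."}]
print(store_questions(conn, "privacy", sample))
```

Keying the table on the question id means that repeated scheduled runs would not store the same question twice.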
Refinements of the script, which increased flexibility, autonomy
and the quantity of data collected:
Command line flags
URL Parameters: start, sort
While loop (illustrated below)
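The while loop refinement listed above can be sketched as follows: the service is queried repeatedly and the start parameter is incremented until an error comes back. In this sketch the error is modeled as a Python exception, the page size of 50 is arbitrary, and `fake_fetch` simulates the service, so all of these names are hypothetical.

```python
class ServiceError(Exception):
    """Hypothetical stand-in for an error response from the service."""


def collect_all(fetch_page, keyword, page_size=50):
    """Call fetch_page with an increasing start offset until it raises."""
    results = []
    start = 0
    while True:
        try:
            page = fetch_page(keyword, start)
        except ServiceError:
            break  # the service reported an error, so stop paging
        results.extend(page)
        start += page_size
    return results


def fake_fetch(keyword, start):
    """Simulated service holding 120 questions; errors past the end."""
    total = 120
    if start >= total:
        raise ServiceError("start is out of range")
    return [f"{keyword}-{i}" for i in range(start, min(start + 50, total))]


questions = collect_all(fake_fetch, "privacy")
print(len(questions))  # prints 120
```

Passing the fetch function in as an argument keeps the loop independent of any particular service, which also makes it easy to exercise without network access.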
While Loop Flowchart
The flowchart above illustrates the while loop refinement: Yahoo!
Answers is queried and the start parameter is incremented until
an error message from Yahoo! is received.
After running the script automatically every two hours for three
days, over seven thousand questions were added to the database.
Conclusions and Next Steps
Both types of analysis reveal interesting facts about the data
collected. They demonstrate which keywords are most effective in
retrieving large quantities of questions from Yahoo! Answers.
Furthermore, the more qualitative approach of the Many Eyes
visualization shows not only the most common words appearing in
the questions, but also the relationship of the word searched for
within the text to other words in the text analyzed.
The next steps for this research include additional natural language
processing and visualizations, like those provided on the Many Eyes
web site. Furthermore, this research contributes to the preliminary
data collection stage of a larger project being conducted at the School
of Information at UC Berkeley. In the scheme of the project in
general, the next steps and final goal are to produce a taxonomy of
I would like to thank the team with which I worked to produce the
command-line tool discussed in this research, consisting of the
following individuals: Christopher Castillo, German Gomez, Rafael
Negron, and Anand Sonkar. In addition, I would like to thank my
graduate student mentors, Nick Doty, MS and Jen King, and my
faculty mentor, Professor Deirdre Mulligan. Finally, I would like to
thank Dr. Kristen Gates, TRUST (The Team for Research in
Ubiquitous Secure Technology), the NSF and UC Berkeley for the
opportunity to conduct this research.
This work was supported by the TRUST Center (NSF award number CCF-0424422)