Writing software or writing scientific articles?
Maria Grazia Pia, INFN Genova, Italy
T. Basaglia (CERN), Z. Bell (ORNL), P. Dressendorfer (IEEE), A. Larkin (IEEE), other authors
Please let me know if you wish to be in the authors list
IEEE Nuclear Science Symposium 2007, Honolulu, HI, USA

Physics Today, March 2004, 61-62
Do software-oriented physicists follow the same publication patterns as their hardware-oriented colleagues?
Do publication habits for software differ between HEP and other radiation physics disciplines?
No scientometric study on this topic yet

Background
[Photos: 1987, 1997, 2007. Photo courtesy of Fermilab archive]

Data analysis
Main sources of data
  ISI Web of Science (covers years > 1990)
  Google Scholar (HEP experiments, years < 1990)
  Publisher web sites and search engines (Elsevier ScienceDirect, IEEE Xplore, etc.)
  Internal editorial data of IEEE TNS (thank you!)
Detailed analyses cover the years 2002-2006
Citation searches: 1990 to today (ISI Web of Science coverage)
Automated searches
  But manual inspection of at least a partial sample: avoid blind analysis!
  Introduction of noise: background evaluation to be refined
Manual scan for paper classification
  In many cases there is no other way to evaluate the pertinence of papers
  Some degree of subjective evaluation (1-10%)
  Conservative bias: assign to software in case of software/hardware ambiguity
Cross-checks with other databases (INSPEC, CDS, etc.)
  For a few samples

HEP experiments
How does software-oriented HEP literature production compare to the hardware-oriented one?
A set of reference HEP experiments
  LEP, LHC, Tevatron, PEP-II, HERA, fixed target, astroparticle
  Apologies to those not included in the statistics: no judgment of merit!
Publications in technical journals only
  Papers on physics results are excluded
Classification: hardware, software, trigger/DAQ
  Trigger/DAQ: more hardware-oriented in the early days (LEP era), more software-oriented nowadays (LHC era)
Manual scan (~300 papers/experiment at most)
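
The bookkeeping behind a manual classification of this kind can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used to produce the statistics in this talk: the `Paper` record, the `sw_hw_ambiguous` flag, the experiment names `EXP-A` and `EXP-B` and the sample entries are all hypothetical. The sketch applies the conservative rule (a paper that is ambiguous between software and hardware is counted as software) and derives a hardware/software ratio per experiment.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Paper:
    experiment: str
    category: str                  # "hardware", "software" or "daq", from the manual scan
    sw_hw_ambiguous: bool = False  # hypothetical flag set by the person scanning


def resolve(paper):
    """Conservative bias: an ambiguous software/hardware paper counts as software."""
    return "software" if paper.sw_hw_ambiguous else paper.category


def tally(papers):
    """Count papers per experiment and per resolved category."""
    counts = defaultdict(Counter)
    for paper in papers:
        counts[paper.experiment][resolve(paper)] += 1
    return counts


def hw_sw_ratio(counter):
    """Hardware/software ratio; infinite when no software paper was found."""
    software = counter["software"]
    return counter["hardware"] / software if software else float("inf")


# Made-up sample, for illustration only.
sample = [
    Paper("EXP-A", "hardware"),
    Paper("EXP-A", "hardware"),
    Paper("EXP-A", "hardware", sw_hw_ambiguous=True),
    Paper("EXP-A", "software"),
    Paper("EXP-A", "daq"),
    Paper("EXP-B", "hardware"),
]

counts = tally(sample)
for experiment in sorted(counts):
    print(experiment, dict(counts[experiment]), hw_sw_ratio(counts[experiment]))
```

Keeping the ambiguity as an explicit flag, rather than overwriting the category during the scan, makes it possible to recompute the ratio without the conservative bias and see how much it moves the result.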

Hardware vs software papers in HEP
LEP (full experimental life-cycle): ALEPH, DELPHI, L3, OPAL
In between: CDF, ZEUS, BaBar
LHC (the new generation): ALICE, ATLAS, CMS, LHCb
Fixed target: NA48
Astroparticle: LNGS, GLAST
Labs: CERN, DESY, FNAL, LNGS, SLAC
[Figure: hardware/software ratio in HEP experiments (ALEPH, DELPHI, L3, OPAL, CDF, ZEUS, BaBar, ALICE, ATLAS, CMS, LHCb, NA48, LNGS)]
[Figure: HEP experiments, technical publications split into hardware, DAQ and software (ALEPH, DELPHI, L3, OPAL, CDF, ZEUS, BaBar, ALICE, ATLAS, CMS, LHCb, NA48, LNGS, GLAST)]

HEP technical publications
Most popular journals
[Figure: HEP technical publications by journal (NIM, TNS, Comp. Phys. Comm., IEEE Magn., IEEE Appl. Supercond., other) for LEP, CDF, BaBar, LHC, ZEUS, NA48, LNGS, GLAST]

Grid computing
The big hype in HEP nowadays
  Not only in HEP
Large investments (funds, manpower)
Large literary production (2002-2006)
  Grid/distributed computing journals: 4572 papers
  NIM A + IEEE TNS: 10386 papers
What are the publication trends in this active computing domain?
Where does HEP stand in the picture?

Grid computing: top 10 institutes
Source: ISI Web of Science, 2002-2006
[Figure: share of grid computing publications for the top 10 institutes: Shanghai Jiao Tong Univ, Chinese Acad Sci, Argonne Natl Lab, UC San Diego, Zhejiang Univ, CERN, Nanyang Technol Univ, Univ Melbourne, CNR, Univ Oxford]

Geographical distribution of publications
[Figure: grid computing publications by region: Africa, Latin America, N. America, Asia, Australia, Europe, Russia, Ukraine]
All types of publications: journals, proceedings

Conference proceedings
[Figure: Grid, top 10 institutes, proceedings: Shanghai Jiao Tong Univ, Chinese Acad Sci, Zhejiang Univ, Natl Univ Def Technol, Tsing Hua Univ, Huazhong Univ Sci & Technol, Nanyang Technol Univ, Northeastern Univ, Korea Univ, Xian Jiaotong Univ]
[Figure: Grid, top 10 countries, proceedings: China, USA, South Korea, UK, Germany, Singapore, Australia, Italy, Taiwan, Spain]

Computing journals
[Figure: Grid, top 10 institutes, computing journals: Univ Calif San Diego, Argonne Natl Lab, Chinese Acad Sci, Univ Tennessee, AGH, Indiana Univ, Univ Illinois, Univ Amsterdam, Univ Texas, Ohio State Univ]
[Figure: Grid, top 10 countries, computing journals: USA, UK, Germany, China, France, Italy, Spain, Japan, Netherlands, Australia]
Different publication habits: US/EU academic environment vs Asian universities

Where is HEP?
Grid computing plays a major role in LHC experiments
HEP labs/institutes play leading roles in grid development

Computing journals + IEEE TNS
[Figure: Grid, top 10 institutes, computing journals + TNS: Univ Calif San Diego, CERN, Argonne Natl Lab, INFN, Chinese Acad Sci, Univ Tennessee, Univ Illinois, AGH Univ. Sci. & Technol., Indiana Univ, FNAL]
[Figure: Grid, top 10 countries, computing journals + TNS: USA, UK, Italy, Germany, France, China, Spain, Japan, Switzerland]
IEEE TNS makes the difference!
No regular paper on grid computing in NIM (only in NIM proceedings)

Simulation - Monte Carlo
One of the main areas of software contribution to experimental physics research
  Event generators
  Particle transport
[Diagram: software (core system developers, application developers, application users), detector, physics]
Which domains for simulation papers?

Monte Carlo codes
Statistics in ISI Web of Science, 2002-2006
Mixed sample
  Geant4: citations
  Others: word search
Includes GEANT-FLUKA (11%)
Beware: Geant4 is often mentioned as GEANT in published papers
[Figure: number of papers per Monte Carlo code: EGS, FLUKA, GEANT, Geant4, MCNP, Penelope]

Journals where mentioned
[Figure: Monte Carlo codes (EGS, FLUKA, GEANT, Geant4, MCNP, Penelope) by journal where mentioned: NIM A, Med. Phys., Radiat. Prot. Dosim., Phys. Med. Biol., IEEE TNS, Appl. Rad. Isot., NIM B, Fus. Eng. Des., Ann. Nucl. En., Health Phys.; journal groups: general, medical, radiation protection, nuclear]
A large fraction of Monte Carlo literature is published in medical physics and radiation protection journals
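
The caveat that Geant4 is often written as GEANT matters for any word search over titles or abstracts. A minimal Python sketch of a word search that keeps the two spellings apart is given below. The regular expressions, the helper names `codes_mentioned` and `count_papers`, and the sample abstracts are illustrative assumptions, not the queries actually run on ISI Web of Science. A bare "GEANT" stays in its own bin because the text alone cannot tell whether the authors meant Geant4; that is precisely the ambiguity flagged above.

```python
import re
from collections import Counter

# Hypothetical patterns; one count per paper and per code, not per occurrence.
PATTERNS = {
    "EGS": re.compile(r"\bEGS(?:4|nrc)?\b", re.IGNORECASE),
    "FLUKA": re.compile(r"\bFLUKA\b", re.IGNORECASE),
    # "Geant4" or "GEANT 4"
    "Geant4": re.compile(r"\bGEANT\s?4\b", re.IGNORECASE),
    # "GEANT", "GEANT3" or "GEANT-FLUKA", but never "Geant4" / "GEANT 4"
    "GEANT": re.compile(r"\bGEANT(?!\s?4)(?:3\b|\b)", re.IGNORECASE),
    "MCNP": re.compile(r"\bMCNPX?\b", re.IGNORECASE),
    "Penelope": re.compile(r"\bPENELOPE\b", re.IGNORECASE),
}


def codes_mentioned(text):
    """Return the set of Monte Carlo codes named in one title or abstract."""
    return {name for name, pattern in PATTERNS.items() if pattern.search(text)}


def count_papers(abstracts):
    """Count, for each code, the number of papers that mention it."""
    counts = Counter()
    for text in abstracts:
        counts.update(codes_mentioned(text))
    return counts


# Made-up abstracts, for illustration only.
abstracts = [
    "Dose distributions were simulated with Geant4 and compared with MCNP.",
    "The detector response was modelled using GEANT.",  # ambiguous: may mean Geant4
    "Shielding studies with GEANT-FLUKA and PENELOPE.",
]

counts = count_papers(abstracts)
for code in PATTERNS:
    print(code, counts[code])
```

Papers that land only in the `GEANT` bin are the natural candidates for the manual inspection step, since their reference lists usually reveal which version was used.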

Journal categories
[Figure: top 5 Monte Carlo categories (defined by journal category) for EGS, FLUKA, GEANT, Geant4, MCNP, Penelope]
HEP Monte Carlo papers represent only a fraction of NIM Monte Carlo papers (all classified as HEP)

Monte Carlo / Simulation
Distribution of articles across experimental topics
ISI Web of Science, 2002-2006
[Figure: Monte Carlo / simulation articles by topic (unclassified, LHC, astroparticle, medical-radprot, nuclear, accelerator, DAQ, trigger) in NIM A, NIM B, IEEE TNS, Med. Phys., PMB, Health Phys., Radiat. Prot. Dosim., Appl. Rad. Isot., Fus. Eng. Des., Ann. Nucl. En.]
Other disciplines publish more papers on Monte Carlo / simulation than HEP

Computing - Software
Generic keyword search: too noisy
Restrict the search to a subset of technical journals
  Computing + software + algorithm + Monte Carlo + simulation
Still some noise introduced in the sample
Some software papers are not retained by the selection
  Comp. Phys. Comm.: 62% of the sample retained
  Fraction of CPC missed: mostly theoretical, non-radiation physics
Tests with other keyword searches do not modify the conclusions substantially
A better check of the noise introduced is needed for TNS
Sample selected: mostly detector application papers

Software - Computing
Keyword search in ISI Web: software + computing + algorithm
Top 10 Nuclear Technology journals
Periods: > 1990 and 2002-2006
[Figure: software, computing, algorithm papers in the top 10 Nuclear Technology journals, > 1990 and 2002-2006, as counts and as percentages: IEEE TNS, J. Fusion En., Int. J. Radiat. Biol., J. Nucl. Mat., NIM A, Radiochim. Acta, NIM B, Appl. Radiat. Isot., Radiat. Meas., Health Phys.]
Dominated by TNS, NIM A/B

Citation statistics
Not necessarily the best metric of scientific relevance, but widely used (journal impact factor)
Most cited papers in HEP labs/institutes
  CERN, INFN, other labs
Most cited papers in selected technology journals
  NIM A, TNS, Med. Phys., Phys. Med. Biol., Rad. Prot. Dos.
Most cited papers in top 10 Nuclear Technology journals
  1. IEEE Trans. Nucl. Sci.
  2. J. Fusion En.
  3. Int. J. Radiat. Biol.
  4. J. Nucl. Mat.
  5. NIM A
  6. Radiochim. Acta
  7. NIM B
  8. Appl. Radiat. Isot.
  9. Radiat. Meas.
  10. Health Phys.
Where do software papers stand?

81269 papers in total
Most cited papers - CERN

1. Sjostrand T
   High-energy-physics event generation with Pythia-5.7 and Jetset-7.4
   Comp. Phys. Comm. 82 (1): 74-89, Aug 1994. Times cited: 1835
2. Antoniadis I
   A possible new dimension at a few TeV
   Phys. Lett. B 246 (3-4): 377-384, Aug 30 1990. Times cited: 981
3. Amaldi U, Deboer W, Furstenau H
   Comparison of grand unified theories with electroweak and strong coupling-constants measured at LEP
   Phys. Lett. B 260 (3-4): 447-455, May 16 1991. Times cited: 801
4. Agostinelli S, et al.
   GEANT4 - a simulation toolkit
   NIM A 506 (3): 250-303, Jul 1 2003. Times cited: 657

Most cited papers - INFN
1. Gammaitoni L et al.
   Stochastic resonance
   Rev. Mod. Phys. 70 (1): 223-287, Jan 1998. Times cited: 1574
2. Marchesini G et al.
   HERWIG 5.1 - A Monte-Carlo event generator for simulating hadron emission reactions with interfering gluons
   Comp. Phys. Comm. 67 (3): 465-508, Jan 1992. Times cited: 999
3. Abe F et al.
   Observation of top-quark production in p̄p collisions with the Collider Detector at Fermilab
   Phys. Rev. Lett. 74 (14): 2626-2631, Apr 3 1995. Times cited: 739
4. Agostinelli S et al.
   GEANT4 - a simulation toolkit
   NIM A 506 (3): 250-303, Jul 1 2003. Times cited: 657
HEP paradox?
Few software publications, but software articles are the most cited (much more than hardware ones!)

How does it compare to other labs?
FNAL
  No software papers among the 100 most cited ones
DESY
  A software paper ranks 4th among the most cited DESY papers
  Lonnblad L
  ARIADNE Version 4 - a program for simulation of QCD cascades implementing the color dipole model
  Comp. Phys. Comm. 71 (1-2): 15-31, Aug 1992. Times cited: 427
LLNL
  Most cited software paper: 88th
  Prestridge DS
  Signal scan - a computer-program that scans DNA-sequences for eukaryotic transcriptional elements
  Computer Applications in the Biosciences 7 (2): 203-206, Apr 1991. Times cited: 325

Most cited papers: NIM A
Top two: software!
1. Agostinelli S et al.
   GEANT4 - a simulation toolkit
   NIM A 506 (3): 250-303, Jul 1 2003. Times cited: 663
2. Radford DC
   ESCL8R and LEVIT8R - Software for interactive graphical analysis of HPGe coincidence data sets
   NIM A 361 (1-2): 297-305, Jul 1 1995. Times cited: 491
Items 3 to 5: large-scale HEP detectors
3. Kubota Y et al.
   The CLEO-II detector
   NIM A 320 (1-2): 66-113, Aug 15 1992. Times cited: 453
4. Adeva B, et al.
   The construction of the L3 experiment
   NIM A 289 (1-2): 35-102, Apr 1 1990. Times cited: 450
5. Ahmet K
   The OPAL detector at LEP
   NIM A 305 (2): 275-319, Jul 20 1991. Times cited: 442

Most cited papers: IEEE TNS
1. Cherry SR et al.
   MicroPET: A high resolution PET scanner for imaging small animals
   IEEE Trans. Nucl. Sci. 44 (3): 1161-1166, Part 2, Jun 1997. Times cited: 234
2. Melcher CL, Schweitzer JS
   Cerium-doped lutetium oxyorthosilicate - a fast, efficient new scintillator
   IEEE Trans. Nucl. Sci. 39 (4): 502-505, Aug 1992. Times cited: 189
3. Strother SC, Casey ME, Hoffman EJ
   Measuring PET scanner sensitivity - relating countrates to image signal-to-noise ratios using noise equivalent counts
   IEEE Trans. Nucl. Sci. 37 (2): 783-788, Part 1, Apr 1990. Times cited: 167
4. Summers GP et al.
   Damage correlations in semiconductors exposed to gamma-radiation, electron-radiation and proton-radiation
   IEEE Trans. Nucl. Sci. 40 (6): 1372-1379, Part 1, Dec 1993. Times cited: 160
5. Hoffman EJ et al.
   3-D phantom to simulate cerebral blood-flow and metabolic images for PET
   IEEE Trans. Nucl. Sci. 37 (2): 616-620, Part 1, Apr 1990. Times cited: 134

Most cited papers: Med. Phys. + Phys. Med. Biol.
1. Nath R, et al.
   Dosimetry of interstitial brachytherapy sources - recommendations of the AAPM Radiation-Therapy Committee Task Group No 43
   Med. Phys. 22 (2): 209-234, Feb 1995. Times cited: 610
2. Rogers DWO et al.
   BEAM - a Monte-Carlo code to simulate radiotherapy treatment units
   Med. Phys. 22 (5): 503-524, May 1995. Times cited: 391
3. Studholme C, Hill DLG, Hawkes DJ
   Automated three-dimensional registration of magnetic resonance and positron emission tomography brain images by multiresolution optimization of voxel similarity measures
   Med. Phys. 24 (1): 25-35, Jan 1997. Times cited: 305
4. Farrell TJ, Patterson MS, Wilson B
   A diffusion-theory model of spatially resolved, steady-state diffuse reflectance for the noninvasive determination of tissue optical-properties in vivo
   Med. Phys. 19 (4): 879-888, Jul-Aug 1992. Times cited: 300
5. Gabriel S, Lau RW, Gabriel C
   The dielectric properties of biological tissues .2. Measurements in the frequency range 10 Hz to 20 GHz

Top 10 Nuclear Technology journals
1. Agostinelli S et al.
   GEANT4 - a simulation toolkit
   NIM A 506 (3): 250-303, Jul 1 2003. Times cited: 663 (grown from 657 to 663 while preparing the slides)
2. Ahlbom A et al.
   Guidelines for limiting exposure to time-varying electric, magnetic, and electromagnetic fields (up to 300 GHz)
   Health Phys. 74 (4): 494-522, Apr 1998. Times cited: 547
3. Murray AS, Wintle AG
   Luminescence dating of quartz using an improved single-aliquot regenerative-dose protocol
   Radiat. Meas. 32 (1): 57-73, Feb 2000. Times cited: 499
4. Radford DC
   ESCL8R and LEVIT8R - Software for interactive graphical analysis of HPGe coincidence data sets
   NIM A 361 (1-2): 297-305, Jul 1 1995. Times cited: 491
5. Kubota Y et al.
   The CLEO-II detector
   NIM A 320 (1-2): 66-113, Aug 15 1992. Times cited: 453

Who cites Geant4?
[Figure: Geant4 citations, top 10 journals (~72% of total citations): NIM A, Phys. Rev. D, IEEE TNS, Phys. Rev. Lett., Med. Phys., Phys. Med. Biol., Phys. Rev. C, NIM B, J. Phys. G, Phys. Lett. B]
Technology journals: 46% of top 10
HEP physics: 33% of top 10
Medical physics: 14% of top 10
Nuclear physics: 5% of top 10

Who does not cite Geant4? (but mentions it in the paper)
[Figure: Geant4 references 2005-2006 in TNS and NIM A, classified as missing, wrong, incomplete or OK]

Scientific software is not commonly perceived as academic research deserving to be cited

Meditations

and action
Computing & Software is the largest track (by number of abstracts) at this conference
  It was the largest last year too, but few software papers presented at the conference were followed by a journal submission
  Proceedings are not the same as publication in a refereed journal!
IEEE TNS
  Highest impact factor in its category
  Welcomes software-related papers
  Our hardware-oriented colleagues give us a good example!
  Manuscript type for software papers: Instrumentation
