Notes on getting started with CMSSW and Heavy Ions at CMS

Everything you need to know to make use of the CMS Tier-2 / Tier-3 analysis center.
http://www.cmsaf.mit.edu/twiki/bin/view/Cms/CmsafUserGuide

Configuration
From cgate.mit.edu one needs to set up the environment by running source .login

which contains

 
#default settings for CMSROOT and ROOT
export SCRAM_ARCH=slc4_ia32_gcc345
source /osg/app/cmssoft/cms/cmsset_default.sh
cd CMSSW_2_2_9/src/
eval `scramv1 runtime -sh`
source /app/cms-sl4/cmsset_default.sh
project CMSSW
source /osg/app/cmssoft/cms/cmsset_default.sh

Set your runtime environment:
 
cd CMSSW_2_1_12/src
cmsenv

Setup root
 
cd $HOME/CMSSW_x_y_z/src
echo "Using x_y_z"
eval `scramv1 runtime -csh`                                   #shown for C shell
cmscvsroot CMSSW

To browse ROOT files produced by StartKitAnalyzer_cfg.py, one needs to copy rootlogon.C into the working directory.

What should be understood?

  • EDAnalyzer
    • How to loop over particles
    • How to retrieve information from events
  • How configuration files are organised
  • How the TFileService works: it deals with opening files and storing histogram output correctly

Modules

Modules have a defined directory structure

.admin
data
doc
interface
src
test
  • interface holds the header files
  • src holds the source files
  • data holds module specific parameter-set fragments (see parameter-sets)
  • test holds test code and configurations for the module
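
As a quick check of the layout listed above, the following sketch reports which of the expected subdirectories are missing from a package directory. It is only an illustration in plain Python: the function name missing_subdirs is made up here, and the list of subdirectories is copied from these notes rather than taken from any CMSSW tool.

```python
from pathlib import Path

# Subdirectories listed in these notes for a module (package) directory.
EXPECTED_SUBDIRS = (".admin", "data", "doc", "interface", "src", "test")


def missing_subdirs(package_dir):
    """Return the expected subdirectories that do not exist under package_dir."""
    root = Path(package_dir)
    return [name for name in EXPECTED_SUBDIRS if not (root / name).is_dir()]


if __name__ == "__main__":
    # Demo/DemoAnalyzer is the package used further down in these notes.
    print(missing_subdirs("Demo/DemoAnalyzer"))
```

An empty list means the package has the full layout; a freshly generated skeleton may legitimately lack some of the directories.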

Modules are configured and scheduled in a parameter-set, which is passed to cmsRun as a command line parameter: cmsRun <parameter-set>

The CMSSW framework is very well described here

  • every parameter-set should include a MessageLogger
    • A parameter-set is used to configure modules and to schedule their execution in a given order. It is defined as a set.
    • the top level parameter set is called process and is composed of
      1. services
      2. modules to be executed (like an EDAnalyzer)
      3. the paths and end paths, which dictate to cmsRun what to do and when

Accessing data
The entry point for the cmsRun executable is an event source:

  • MC generators can be used to generate events which are then reconstructed and stored in files.
  • CMSSW ROOT files can be opened by the PoolSource and processed.

untracked PSet maxEvents = {untracked int32 input = -1}

source = PoolSource 
{
  untracked vstring fileNames = {"file:tutorial-simulation.root"}
  untracked uint32 skipEvents = 0
}
  • Files from various sources can be opened by the PoolSource; the source is selected by an identifier in front of the file name
    • file: opens files from the local filesystem
    • dcap: opens files from dCache (URL syntax: dcap://cmsdca.fnal.gov:24136/pnfs/fnal.gov/usr/cms/WAX/11/store/ ...)
    • dcache: opens files from dCache (pnfs syntax: dcache:/pnfs/cms/WAX/11/store/ ...)
    • rfio: opens files from Castor and DPM

  • If no identifier is specified, the Trivial File Catalog is used to prepend the site specific mass storage information
    • at FNAL, /store/... is translated into dcap://cmsdca.fnal.gov:24136/pnfs/fnal.gov/usr/cms/WAX/11/store/...
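
The rule described above can be mimicked in a few lines of plain Python. This is only a sketch of the idea, not the Trivial File Catalog itself (which is configured per site): the function name resolve_file_name is hypothetical, and the site prefix is the FNAL one quoted above.

```python
# Identifiers listed in these notes.
KNOWN_PREFIXES = ("file:", "dcap:", "dcache:", "rfio:")

# Site-specific prefix quoted above for FNAL; other sites use a different one.
FNAL_PREFIX = "dcap://cmsdca.fnal.gov:24136/pnfs/fnal.gov/usr/cms/WAX/11"


def resolve_file_name(name, site_prefix=FNAL_PREFIX):
    """Leave names with a known identifier alone, else prepend the site prefix."""
    if name.startswith(KNOWN_PREFIXES):
        return name
    return site_prefix + name


if __name__ == "__main__":
    print(resolve_file_name("file:tutorial-simulation.root"))
    print(resolve_file_name("/store/relval/2008/4/17/example.root"))
```

The second call uses a made-up file name under /store to show the translation; a name that already carries an identifier is returned unchanged.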

Where to get data samples?
see https://cmsweb.cern.ch/dbs_discovery/getLFNsForSite?dbsInst=cms_dbs_prod_global&site=ccsrm.in2p3.fr&datasetPath=/RelValTTbar/CMSSW_2_1_0_pre2-RelVal-1208465820/RECO&what=cff&userMode=user&run=*

for example /store/relval/2008/4/17/RelVal-RelValTTbar-1208465820/0000/FAAF43EC-0C0D-DD11-B3EC-000423D996B4.root

ls -latrh  /pnfs/cmsaf.mit.edu/t2bat/cms/store/user/davidlw/HYDJET_GEN_X2_MB_NEW_4.0TeV/HYDJET_GEN_X2_MB_NEW_4.0TeV/85f6e5122c18f2b815f0ff541caae499/*

Look at a data file
TDCacheFile *_file0 = TDCacheFile::Open("/pnfs/cms/WAX/11/store/relval/CMSSW_2_2_8/RelValZMM/GEN-SIM-RECO/STARTUP_V9_v1/0000/6820DE4B-BE2C-DE11-975D-000423D99658.root");

Simple example to copy some data
Try running cmsRun copy_cfg.py or another example from https://twiki.cern.ch/twiki/bin/view/CMS/WorkBookWriteFrameworkModule

Output files
  • Output is handled by the PoolOutputModule:
module Out = PoolOutputModule
{
  untracked string fileName = "tutorial-cal.root"
}
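
For comparison, the source, event limit and output module shown above in the old configuration language can be written in the Python configuration syntax used later in these notes. This is a minimal sketch rather than a tested configuration: the process name "Demo", the path names p and e, and the use of DemoAnalyzer are assumptions, and it only runs inside a CMSSW environment.

```python
import FWCore.ParameterSet.Config as cms

process = cms.Process("Demo")  # process name is an assumption

# every parameter-set should include a MessageLogger
process.load("FWCore.MessageService.MessageLogger_cfi")

# -1 means: process all events
process.maxEvents = cms.untracked.PSet(input = cms.untracked.int32(-1))

# event source: read an existing CMSSW ROOT file
process.source = cms.Source("PoolSource",
    fileNames = cms.untracked.vstring("file:tutorial-simulation.root"),
    skipEvents = cms.untracked.uint32(0)
)

# module to be executed (DemoAnalyzer is the analyzer built later in these notes)
process.demo = cms.EDAnalyzer("DemoAnalyzer")

# output handled by the PoolOutputModule
process.Out = cms.OutputModule("PoolOutputModule",
    fileName = cms.untracked.string("tutorial-cal.root")
)

# paths tell cmsRun what to run; output modules go in an end path
process.p = cms.Path(process.demo)
process.e = cms.EndPath(process.Out)
```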

Looking for code
  • If you already know where to look - cvs browser:
  • In all other 99% of the cases - lxr browser:
  • or simply at command line:
cmsglimpse <string>

Good exercise

  • The complete tutorial is here

Make an EDAnalyzer
workbook

Edit Demo/DemoAnalyzer/BuildFile so that it looks like this (add DataFormats/TrackReco as shown):

<use name=FWCore/Framework>
<use name=FWCore/PluginManager>
<use name=FWCore/ParameterSet>
<use name=DataFormats/TrackReco>
<flags EDM_PLUGIN=1>
<export>
   <lib name=DemoDemoAnalyzer>
   <use name=FWCore/Framework>
   <use name=FWCore/PluginManager>
   <use name=FWCore/ParameterSet>
   <use name=DataFormats/TrackReco>
</export>

Edit Demo/DemoAnalyzer/src/DemoAnalyzer.cc:

  • Add the following include statements (together with the other include statements):

#include "DataFormats/TrackReco/interface/Track.h"
#include "DataFormats/TrackReco/interface/TrackFwd.h"
#include "FWCore/MessageLogger/interface/MessageLogger.h"

  • Edit the method analyze, which starts with

DemoAnalyzer::analyze(const edm::Event& iEvent, const edm::EventSetup& iSetup)

and put the following lines below the using namespace edm; statement:

Handle<reco::TrackCollection> tracks;
iEvent.getByLabel("generalTracks", tracks);
LogInfo("Demo") << "number of tracks " << tracks->size();

To print out the information that we added in the analyzer, we need to replace the line process.load("FWCore.MessageService.MessageLogger_cfi") in the file demoanalyzer_cfg.py with the segment below

# initialize MessageLogger and output report
process.load("FWCore.MessageLogger.MessageLogger_cfi")
process.MessageLogger.cerr.threshold = 'INFO'
process.MessageLogger.categories.append('Demo')
process.MessageLogger.cerr.INFO = cms.untracked.PSet(
    default          = cms.untracked.PSet( limit = cms.untracked.int32(0)  ),
    Demo = cms.untracked.PSet( limit = cms.untracked.int32(-1) )
)
process.options   = cms.untracked.PSet( wantSummary = cms.untracked.bool(True) )

  • Edit the Demo/DemoAnalyzer/src/DemoAnalyzer.cc file. Add a new member data line to the DemoAnalyzer class:

private:
// ----------member data ---------------------------
unsigned int minTracks_;

  • Edit the constructor so that it initialises minTracks_ from the configuration, for example (assuming the parameter is read as an untracked unsigned int named minTracks, matching the configuration further down):

DemoAnalyzer::DemoAnalyzer(const edm::ParameterSet& iConfig) :
  minTracks_(iConfig.getUntrackedParameter<unsigned int>("minTracks", 0))

The analyze method will now look like this. Note that the first LogInfo("Demo") line has been commented out, since now we want "number of tracks" to be printed only if the number of tracks is at least minTracks_:

void
DemoAnalyzer::analyze(const edm::Event& iEvent, const edm::EventSetup& iSetup)
{
   using namespace edm;

   Handle<reco::TrackCollection> tracks;
   iEvent.getByLabel("generalTracks", tracks);
   // LogInfo("Demo") << "number of tracks " << tracks->size();

   // print only for events with at least minTracks_ tracks
   if( minTracks_ <= tracks->size() ) {
      LogInfo("Demo") << "number of tracks " << tracks->size();
   }
}

Also, to check that minTracks_ actually gets used, replace the line process.demo = cms.EDAnalyzer('DemoAnalyzer') in demoanalyzer_cfg.py by the segment below, which sets minTracks to, say, 50

process.demo = cms.EDAnalyzer("DemoAnalyzer",
           minTracks = cms.untracked.uint32(50)
         )

1. Edit the configuration file, demoanalyzer_cfg.py: add the EventContentAnalyzer module and add it to the path, so that the script has the following lines. The process.p line should already be there from before; extend it so that both modules run in the same path.

process.dump = cms.EDAnalyzer('EventContentAnalyzer')
process.p = cms.Path(process.demo*process.dump)

In this case you probably want to run over only one event, so change the number of events to 1 (the default is -1, which means all events):

process.maxEvents = cms.untracked.PSet( input = cms.untracked.int32(1) )

1. Edit the configuration file, demoanalyzer_cfg.py, and add the Tracer service (just above the line that defines process.p):

process.Tracer = cms.Service("Tracer",
      sourceSeed = cms.untracked.string("$$")
   )

Example of analysing track information

Next we replace the example code in MyTrackAnalyzer::analyze with

void
MyTrackAnalyzer::analyze(const edm::Event& iEvent, const edm::EventSetup& iSetup)
{
   // Get Inputs
   edm::Handle<reco::TrackCollection> tracks;
   iEvent.getByLabel(trackProducerTag_, tracks);
   
   edm::LogInfo("Tutorial") << "number of Tracks: " << tracks->size();

   for ( reco::TrackCollection::const_iterator track = tracks->begin(); track != tracks->end(); ++track ) {
      edm::LogInfo("Tutorial") << "Perigee-Parameter: q over p: " << track->qoverp();
      edm::LogInfo("Tutorial") << "Perigee-Parameter: lambda: " << track->lambda();
      edm::LogInfo("Tutorial") << "Perigee-Parameter: theta: " << track->theta();
      edm::LogInfo("Tutorial") << "Perigee-Parameter: d0: " << track->d0();
      edm::LogInfo("Tutorial") << "Perigee-Parameter: dz: " << track->dz();
   }

}
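
To make the printed quantities less abstract, here is a small plain-Python sketch of how three of them relate to a track's momentum. It assumes the usual conventions (q/p is the charge divided by the momentum magnitude, theta is the polar angle, and lambda = pi/2 - theta is the dip angle); the function name perigee_angles is hypothetical, and d0 and dz are left out because they depend on the reference point.

```python
import math


def perigee_angles(px, py, pz, charge):
    """Return (qoverp, theta, lam) for a momentum vector and a charge."""
    p = math.sqrt(px * px + py * py + pz * pz)  # momentum magnitude
    pt = math.hypot(px, py)                     # transverse momentum
    theta = math.atan2(pt, pz)                  # polar angle, between 0 and pi
    lam = math.pi / 2 - theta                   # dip angle
    return charge / p, theta, lam


if __name__ == "__main__":
    # A negative track at 45 degrees to the beam axis (made-up numbers).
    print(perigee_angles(1.0, 0.0, 1.0, -1))
```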

CVS

https://twiki.cern.ch/twiki/bin/view/CMS/WorkBookSetComputerNode#CreateWork

scramv1 list CMSSW

cmscvsroot CMSSW
cvs login (98passwd)
cvs co -r CMSSW_2_2_9 GeneratorInterface/HydjetInterface

Check out this package from CVS for CMSSW_2_2_10:

cvs co -r CMSSW_2_2_10 SimG4Core/Application/test

To make sure that the package comes from the release you are currently using, use addpkg instead; note that it does not let you check out only the test directory:

addpkg SimG4Core/Application

Moving from CMSSW_2_2_9 to CMSSW_3_1_0_pre7

  • change the include

#include "SimDataFormats/HepMCProduct/interface/HepMCProduct.h"
to
#include "SimDataFormats/GeneratorProducts/interface/HepMCProduct.h"

  • change to "generator" instead of "source" in the py configuration file

  • error:

Duplicate Events found in entire set of input files.
Both events were from run 1 and luminosity block 1 with event number 7.
The duplicate was from file dcap:///pnfs/cmsaf.mit.edu/hibat/cms/users/davidlw/HYDJET_Minbias_4TeV_31X/gen/HYDJET_Minbias_4TeV_seq100.root.
The duplicate will be skipped.

Paolo MERIDIANI wrote:
>>> Please add in your source
>
>>> process.source = cms.Source("PoolSource",
>>> noEventSort = cms.untracked.bool(True),
>>> duplicateCheckMode = cms.untracked.string('noDuplicateCheck')
>>>
>>> even if the random seeds are different for each job (1000 events each 
>>> job) the same run number is instead assigned, triggering the standard 
>>> duplicate event checking when reading
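
The check compares event identifiers, so two jobs that both write run 1, luminosity block 1 produce clashing events even when their random seeds differ. A rough illustration in plain Python follows; it is not the framework's actual implementation, and the file names and the helper find_duplicates are hypothetical.

```python
def find_duplicates(files):
    """Find events whose (run, lumi, event) identifier was already seen.

    files: list of (file_name, [(run, lumi, event), ...]) in reading order.
    Returns a list of (event_id, file_name) for every repeated identifier.
    """
    seen = set()
    duplicates = []
    for file_name, event_ids in files:
        for event_id in event_ids:
            if event_id in seen:
                duplicates.append((event_id, file_name))
            else:
                seen.add(event_id)
    return duplicates


# Two jobs with different seeds but the same run number: identifiers clash.
same_run = [
    ("job_a.root", [(1, 1, n) for n in range(1, 4)]),
    ("job_b.root", [(1, 1, n) for n in range(1, 4)]),
]

# In this model, distinct run numbers per job give distinct identifiers.
distinct_runs = [
    ("job_a.root", [(1, 1, n) for n in range(1, 4)]),
    ("job_b.root", [(2, 1, n) for n in range(1, 4)]),
]

if __name__ == "__main__":
    print(find_duplicates(same_run))
    print(find_duplicates(distinct_runs))
```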

How do I find which methods can be used on an object?

  • If it's a standard object, look it up on Google, e.g.
HepMC::GenParticle
  • If it's not standard, look at the file in the CVS tree, for example by trying one of the included files

MC Truth Match

For instance, to plot the reconstructed muon pt versus the true pt, from generator particles, you can use the following interactive ROOT command:

Events.Draw("allMuons.data_[allMuonsGenParticlesMatch.map_.first].pt():genParticleCandidates.data_[allMuonsGenParticlesMatch.map_.second].pt()")
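
The Draw expression pairs each reconstructed muon with its matched generator particle through the two indices stored in the match map. The same lookup in plain Python, as an illustration only (the function name and the sample numbers are made up; the real data live in the ROOT file):

```python
def matched_pt_pairs(reco_pts, gen_pts, match_pairs):
    """Return (reco_pt, gen_pt) for each (reco_index, gen_index) match."""
    return [(reco_pts[i], gen_pts[j]) for i, j in match_pairs]


if __name__ == "__main__":
    reco_pts = [10.0, 20.0]          # reconstructed muon pt (made-up values)
    gen_pts = [9.5, 21.0, 5.0]       # generator particle pt (made-up values)
    match_pairs = [(0, 0), (1, 1)]   # first = reco index, second = gen index
    print(matched_pt_pairs(reco_pts, gen_pts, match_pairs))
```

Unmatched generator particles (index 2 in the made-up sample) simply never appear in the output, just as they are absent from the plot.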

For each matched object of any type, a unique object in a collection of GenParticle objects is stored. For convenience, the following typedef is defined in DataFormats/HepMCCandidate/interface/GenParticleFwd.h:

namespace reco {
  typedef edm::Association<GenParticleCollection> GenParticleMatch;
}
An example of code accessing an association is the following:
  Handle<GenParticleMatch> match;
  event.getByLabel( "zToMuMuGenParticlesMatch", match );
  
  CandidateRef cand = ...; // get your reference to a candidate
  GenParticleRef mcMatch = (*match)[cand];

nice graphics

http://indico.cern.ch/getFile.py/access?contribId=86&sessionId=22&resId=0&materialId=slides&confId=46769

-- CatherineSilvestreTello - 14 May 2009

Topic revision: r13 - 2009-07-06 - 13:46:42 - CatherineSilvestreTello
 