Jim Doran
     

  Retired 37y Distinguished Engineer - IBM Research  
  Adjunct Professor - Western Connecticut State University 
  Active Steel Challenge Competitor 
  Match Director for EFGC in Albany, NY 
  31y Married; Father of 2  
  Connecticut / Florida Resident.  
  FY105260 My Classifications  
  jimdoran64@gmail.com

 

1.  Motivation: After following threads on the brianenos.com forums, I was puzzled why SCSA.org limits the classifications on its site to the Top 20 and caps them at 100% achievement. Being curious, I manually calculated some of the Top 20 RFRO classifications that were above 100% and discovered that I should be able to produce a larger and more accurate list. I decided to see whether I could automate my manual process: go through USPSA numbers one by one, calculate each actual classification percentage (sometimes above 145%), and produce as large a list as I can. Many on the forum threads also expressed a desire to see where their own classification stands in relation to the top shooters (i.e., am I in position 88 or 1088 in RFRO?).
To achieve this, I needed to do a few things:

    - Collect / produce a large pool of USPSA numbers to feed into my program
    - Traverse the large list and obtain the actual classification for each number
    - Organize the data / eliminate duplicate USPSA numbers and classifications that were not found
    - Organize by division / sort / assemble into a hopefully useful landing page (I'm an engineer, definitely not a designer)
    - Publish the data and presentation to an external website (cloud)


2.  Harvest Matches: To build a pool of USPSA numbers to feed into the classification machine, I accessed the SCSA match results for 2023, 2022, and 2021; each year has more than 2200 matches.

3.  Harvest Shooter Data: I iterated through more than 7500 matches, accessing the combined list of shooting results for each match (ex: 2023 WSSC Match). I pulled the USPSA numbers for every shooter in every match and built a database. Many competitors shoot multiple guns at several matches throughout the years, which resulted in more than 90,000 numbers with many duplicates. I now had a list of every Steel Challenge USPSA number in the world (for anyone who shot in the last 3 years). I de-duplicated the data, shrinking my pool to 15,000 unique USPSA numbers (division agnostic). This pool covers Steel Challenge competitors who have participated in a sanctioned event over the last 3 years.
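The de-duplication step can be sketched roughly as follows. This is a minimal illustration, not the actual program: the function name `dedupe_member_numbers` is hypothetical, the trimming and upper-casing is an assumed normalization, and every sample number other than the one in my bio is made up.

```python
def dedupe_member_numbers(raw_numbers):
    """Return unique USPSA numbers in first-seen order.

    Assumption: numbers that differ only by surrounding whitespace
    or letter case refer to the same shooter.
    """
    seen = set()
    unique = []
    for raw in raw_numbers:
        number = raw.strip().upper()
        # Skip blanks and anything already collected.
        if not number or number in seen:
            continue
        seen.add(number)
        unique.append(number)
    return unique


# Made-up harvest: the same shooter appears once per gun per match.
harvested = ["FY105260", " fy105260 ", "A12345", "", "A12345"]
pool = dedupe_member_numbers(harvested)
```

A set gives constant-time membership checks, so even 90,000 harvested entries reduce to the unique pool in a single pass.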



4.  Compute realtime and current classifications: Putting it all together: the number crawler builds a pool of 90,000 numbers and reduces it to 15,000. This pool is then fed into a scraper that computes classification results from the actual page that shows each shooter's times and matches. It goes to each shooter's classification page and collects the peak time and total time. It then sorts and assembles all of the data via Python data functions, produces HTML output, and publishes to the AWS S3 object store for hosting (setting access permissions, etc.). I discovered that many shooters, despite being listed in the Top 20, no longer have current classifications and have expired data for a division. When I don't find a current classification, I set the percentage to zero so those shooters sort to the bottom.
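Here is a rough sketch of the percentage and sorting logic. It assumes the uncapped percentage is the peak time divided by the shooter's total time, times 100; the `Shooter` class, the function names, and the sample numbers are all hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Shooter:
    number: str
    peak_time: Optional[float]   # None when no current classification was found
    total_time: Optional[float]


def classification_percent(peak_time, total_time):
    """Uncapped percentage, assumed to be 100 * peak / total.

    Missing or expired data yields 0.0 so the shooter sorts to the bottom.
    """
    if peak_time is None or total_time is None or total_time <= 0:
        return 0.0
    return 100.0 * peak_time / total_time


def rank(shooters):
    """Return (number, percent) pairs sorted from highest to lowest percent."""
    rows = [(s.number, classification_percent(s.peak_time, s.total_time))
            for s in shooters]
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Made-up sample: one shooter beats the peak time, one has expired data.
sample = [
    Shooter("A1", 90.0, 100.0),
    Shooter("A2", None, None),
    Shooter("A3", 100.0, 80.0),
]
ranking = rank(sample)
```

Because nothing clamps the result at 100, a shooter whose total time beats the peak time lands above 100%, which is exactly the case the Top 20 list hides.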


5.  Cloud Deployment: While I develop on my Mac laptop, for durability and performance I execute all of the longer-running web data scraping in the Amazon AWS cloud and use its S3 service to host the web content. A big obstacle was that many hosting platforms, such as USPSA and SCSA.org, utilize rate limiting (exceed it and they shut your IP down for 24 hours; ask me how I know). To work around this, I had to slow my data collection down from hundreds of requests per second to just 1 every 5 seconds. I also utilize multiple machines across different geographies. Hosting in the cloud also allows me to share the data and presentation with all of you.
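The slowdown to one request every 5 seconds can be sketched as a simple throttle loop. This is an illustration under assumptions, not the actual scraper: `throttled_fetch` and its parameters are hypothetical names, and `fetch` stands in for whatever function retrieves one shooter's page.

```python
import time


def throttled_fetch(items, fetch, min_interval=5.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Call fetch(item) for each item, starting calls at least
    min_interval seconds apart.

    sleep and clock are injectable so the loop can be exercised
    without actually waiting.
    """
    results = []
    last_start = None
    for item in items:
        if last_start is not None:
            # Wait out whatever remains of the interval since the last call began.
            wait = min_interval - (clock() - last_start)
            if wait > 0:
                sleep(wait)
        last_start = clock()
        results.append(fetch(item))
    return results
```

At this pace a pool of 15,000 numbers takes roughly 75,000 seconds, a little under 21 hours, on a single machine, which is one reason splitting the pool across several machines helps.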

Special thanks to fellow SC shooters Cliff, Ray, Paul, and George for their insight and second set of eyes. Thanks also to my wife for dealing with this several-week-long, late-night obsession.
  Cliff P
  Ray H
  Paul S
  George P



  Return to JD Stats Steel Challenge page