School of Data Science

 

What is being built?

Ricky is building a free, open source resource that anyone can use to learn topics relating to data science. It is meant to have no barrier to entry and to keep evolving as an open source project. The project is in the very early phases of being built, and no content is yet available to students (outside of what is available on this website). In recent years, open source tools built on languages like Python, R, and JavaScript have become the standard for working with data and creating engaging analyses and reports. These tools are made by benevolent programmers on the internet who team up to write software that anyone can use for any purpose (including a for-profit business), and yet most of the structured content available online to help students learn these free tools is for-profit. The typical argument is that charging is necessary in order to provide the best quality content. This project takes a different approach to how content is made; more details are below under “Guiding Principles”.

The school is based in the Metaverse, and is currently located in Cryptovoxels on the island of “Far Far Away”, where Ricky owns 12 connected parcels dedicated to this project. Students will be able to interact with a “realistic” university campus. The experience will rely heavily on scripting so that students can tailor it to the subject they want to learn more about and adjust it to their own wants and needs. The school will be designed to answer interesting real problems as a somewhat gamified experience. The hope is that users will want to visit to view the latest answers to an interesting problem, or just to have fun, rather than coming with the specific intent of learning data science topics. The best learning experiences are those where users spend time on the platform because they are genuinely entertained, and learn as a side effect of that time. That is what this project hopes to achieve.

Students will be given more structured tools to learn the content through a more traditional content provider like Udemy. The course will forever be free of charge, and will answer interesting questions using data from smart contracts across several blockchains and projects. It will use those questions to walk any beginner through a practical approach to each real problem, working from an older snapshot of the data to make the material easier to follow. Once a student feels more comfortable working with data, they can come to the School of Data Science in Cryptovoxels to take on the same analyses on actual live data, and they can complete puzzles which will award them NFTs (of varying rarity) and “points”. The Cryptovoxels land for the project is embedded below, followed by more information about the guiding principles and roadmap. In the window below you can input a number and click the button to update the plot being shown, which is made using R. You can also click here to go to a different location where you can input any cryptocurrency symbol and press the button, and the 3D text will show the current price in real time.

Guiding Principles

Below are some guiding principles around the long-term vision of the project. Today this is just Ricky’s personal project, so the goals below are aspirational and do not describe how it works today. See the next section for the roadmap.

  • Open source. Meant as an open project for anyone to use as a learning platform, and something teachers can reasonably contribute to.

  • Campus-like experience available to anyone. The experience is meant to feel like being on an actual campus, especially when accessing the world in VR (already possible on Oculus Quest 2). See the design principles for the school that the contributors Powderly and Ottis graciously contributed to the project: https://docs.google.com/presentation/d/1wgkvyf2wX7yl169TN9OgulHmHBDwyuISt4JiersNo-E/

  • All content/tutorials will be built using real live data sources. Each exercise will be designed as an analysis that keeps improving over time and stays relevant to a real problem, which will help keep the content from becoming outdated/deprecated and lets the broader community keep improving it. The project will also be designed to be open source and to have a cryptocurrency governance system (no tokenomics). This will give anyone a path towards becoming a teacher: publishing content in the established format, which uses a decentralized live data connection to a blockchain smart contract, and making that content available to any student. Once there is a consensus that the content should be admitted into the school and it is singled out as being good, the teacher will be awarded a governance token for the school, which they can also use to vote on which students provided excellent answers to a given problem. This is not on the short-term roadmap and the details are still to be determined; for example, the token is likely to have its functionality deactivated when traded to another user.

  • A student’s time is valuable, and the effort they put in to be tested should not go to waste. A student in our school works on real problems with real live (or close to real-time) data. The long-term vision is to create a platform through which students can submit their analysis for a given problem. Teachers whose content made its way into the school will be able to upvote submissions they think are good, and will have tools to call out the ones they believe stand out from the rest. A student wouldn’t be penalized for a bad analysis or submission (hopefully removing the incentive to plagiarize someone else’s work), but a student who submits really useful/insightful content would add to their “score/grade”, and the submission would be highlighted on a leaderboard of the “best” solutions for each available problem. Anyone would be free to view the analysis, including universities, governments, and companies. A student would be able to opt out, but they wouldn’t be eligible for rewards if they aren’t willing to open source that work.

  • Application permanence. The ownership of each parcel is determined by records on the Ethereum blockchain, and the content of all parcels is backed up daily by Ricky on the Arweave blockchain. As new technology becomes available, one of Ricky’s primary focuses for the project is to create truly permanent applications, both on the data pipeline side and on the application deployment/hosting/serving side.

  • The technology a student wants to use is irrelevant. What is important is that they can take what they’ve learned and apply it, using reproducible code, to a new question they haven’t faced before. A student will therefore be able to work in whatever technology they choose to answer the questions on real/live data, and will have separate opportunities to share and showcase their solutions and in-depth analysis, where teachers can then single out excellent work.

  • The focus to start will be on topics relating to data science because they enable reproducibility on real-time data, but eventually it would be amazing to extend the functionality to other fields. It will take time for things like VR to improve and be more accessible, but it would be incredible to work towards democratizing education for things like biology/chemistry/medicine as well long-term.

 
 

Updates and milestones

June 2021:

  • Successfully created examples pulling data from 4 different live blockchains in Cryptovoxels. Using GraphQL endpoints by The Graph, the data shown has a strong chance of still being around 10+ years from now and is a sustainable way of building a data pipeline (vs. a traditional API which tends to become deprecated over time).

  • Created a first test of a predictive model built on live data from The Graph inside of Cryptovoxels.

  • Started working towards a configurable art gallery where the user defines what the gallery displays. For example, the user will be able to specify a tag and populate the gallery with the latest (or most expensive sold) works matching that tag. The gallery is intended to offer a high degree of configuration.
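
The data pulls described in the list above are GraphQL queries sent as JSON over HTTP POST to a subgraph endpoint. The following is a minimal sketch of that pattern, not the project's actual code: the endpoint URL, the `sales` entity, and its fields are hypothetical placeholders, since every subgraph defines its own schema. The `first`, `orderBy`, and `orderDirection` arguments are the standard ones The Graph exposes on collection queries.

```python
import json
import urllib.request

# Hypothetical placeholder: substitute the query URL of a real subgraph.
SUBGRAPH_URL = "https://example.com/subgraphs/name/some-org/some-subgraph"

# Hypothetical entity and field names; each subgraph defines its own schema.
QUERY = """
query LatestSales($first: Int!) {
  sales(first: $first, orderBy: timestamp, orderDirection: desc) {
    id
    price
    timestamp
  }
}
"""


def build_request(url, query, variables):
    """Package a GraphQL query as a JSON POST request (nothing is sent yet)."""
    body = json.dumps({"query": query, "variables": variables}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_rows(response_text, entity):
    """Return the rows for `entity`, raising if the server reported errors."""
    payload = json.loads(response_text)
    if payload.get("errors"):
        raise RuntimeError(f"GraphQL errors: {payload['errors']}")
    return payload["data"][entity]


def fetch_rows(url, query, variables, entity, timeout=10):
    """Send the query and return the parsed rows (requires network access)."""
    request = build_request(url, query, variables)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return extract_rows(response.read().decode("utf-8"), entity)
```

With a real endpoint and schema substituted in, `fetch_rows(SUBGRAPH_URL, QUERY, {"first": 5}, "sales")` would return the five most recent rows as a list of dictionaries. Keeping request building and response parsing in separate functions makes both easy to check without a network connection.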

July 2021:

  • Created an API that allows a user to provide inputs in Cryptovoxels, execute R code on a remote server, and see the result returned inside of Cryptovoxels. This milestone will enable Ricky to create real R tutorials that users can interact with from Cryptovoxels. It works very well and provides the foundation for educational data science content inside of Cryptovoxels. The same could potentially be done down the line using Python.
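
The general shape of such an API can be sketched as a small HTTP service that validates the user's input, hands it to an R script, and returns the script's output. This is an illustration under stated assumptions, not the project's implementation: the script name `make_plot.R`, the `n` query parameter, the input limit, and the port are all hypothetical, and the wrapper is written in Python only for the sake of the example. It assumes `Rscript` is installed on the server.

```python
import subprocess
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse

R_SCRIPT = "make_plot.R"  # hypothetical R script that reads one number
MAX_POINTS = 10_000       # hypothetical upper bound on the user's input


def parse_points(query_string):
    """Validate the input: exactly one whole number from 1 to MAX_POINTS."""
    values = parse_qs(query_string).get("n", [])
    if len(values) != 1 or not (values[0].isascii() and values[0].isdigit()):
        raise ValueError("n must be a single whole number")
    n = int(values[0])
    if not 1 <= n <= MAX_POINTS:
        raise ValueError(f"n must be between 1 and {MAX_POINTS}")
    return n


def build_command(n):
    """Build the argument list for Rscript; no shell is involved."""
    return ["Rscript", "--vanilla", R_SCRIPT, str(n)]


class RHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        try:
            n = parse_points(urlparse(self.path).query)
        except ValueError as error:
            self._reply(400, str(error))
            return
        try:
            result = subprocess.run(
                build_command(n), capture_output=True, text=True, timeout=30
            )
        except (OSError, subprocess.TimeoutExpired):
            self._reply(500, "could not run the R script")
            return
        if result.returncode != 0:
            self._reply(500, "the R script failed")
        else:
            self._reply(200, result.stdout)

    def _reply(self, status, text):
        body = text.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        # Let a script served from another origin read the reply.
        self.send_header("Access-Control-Allow-Origin", "*")
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Run the service until interrupted."""
    HTTPServer(("127.0.0.1", port), RHandler).serve_forever()
```

Calling `serve()` and requesting `/?n=25` would run the R script with `25` as its argument. Accepting only a bounded whole number and passing it as a separate argument, rather than building a shell string, keeps user input from being interpreted as code on the server.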

September 2021:

  • Released some educational content around scripting in Cryptovoxels and using endpoints from The Graph to create dApps on a Cryptovoxels parcel. The idea could potentially be extended to a Decentraland tutorial.

April 2022:

  • Ongoing video series “Business Analytics on the Blockchain”: https://www.youtube.com/playlist?list=PL4Vi6FgCp7QxM0uvOtoSJrxztQvNvW4UA

August 2022:

  • Experiments interacting with smart contracts directly from the metaverse, and connecting a wallet from a VR headset: https://twitter.com/Esclaponr/status/1563912023381008385?s=20&t=X0tvFI5gEn6oOgjJWuOXxQ

 
 

Future Work

  • Continued experimentation around best ways to bridge educational content into an interactive experience in the metaverse.

  • Create experiences that enable users to learn on live data, and answer questions on the real-time data to be awarded NFTs.

  • Keep building a university “campus” in the metaverse. Make a powerful free library of books and content that is easily queryable. Make classrooms that a user can configure for different subjects, different wallets to analyze, and things of that nature. Make buildings for students to congregate and share ideas, as well as a large auditorium for live lectures, speakers, and conferences.

  • Begin the decentralization of the platform and write custom smart contracts. Create ways for anyone to become a teacher, and iterate on the open source contribution mechanisms and the governance established by the tokens separately awarded to teachers and students based on their activity and contributions.

  • Create ways for the students to get certificates of completion and for the content to be recognized as a valid way for students to show an employer they have a particular skillset. Expand the visibility and benefit to students who come up with great solutions on real problems.

 
 

FAQ

  • Does this cost Ricky money to maintain?

    • Yes and no. It costs Ricky money to maintain his own powerful SQL database, servers, websites, and all sorts of other things. But the long-term vision is for the project to work as a permanent application that functions as a public piece of infrastructure. The plan is to set up the systems/apps with enough funds to keep functioning for ~10 years by default, and to provide ways for anyone to send funds to a wallet and extend that lifespan. The system would therefore be designed, in theory, to keep working if something happened and Ricky couldn’t afford to maintain the project anymore. This will take time, as the projects that make this possible are relatively new (The Graph - GRT, the Internet Computer - ICP, Arweave - AR), and Ricky is still learning to use the technologies.

  • What is not allowed as a user?

    • Purposely trying to overload the content with spam traffic.

    • Creating tools that people can use to cheat in the school and create an answer bank for people to search from.

  • What is done towards the content being “permanently” available?

    • The ownership of the parcels of land is established through the Ethereum blockchain, while the content on the parcels is backed up daily on the Arweave blockchain. As scripts and content become available, they will keep being backed up on whatever blockchains and tools give the strongest chance of permanence. The data sources powering everything will run on The Graph, which addresses this problem on the data side and is extremely helpful for this project. Ricky is actively working to make this a reality, and he believes it is already pretty doable with some of the new blockchain technology that has reached production mainnets, at least in the sense that if the blockchain in charge of a permanent backup disappears, the same information can be migrated to a new and better tool. In fact, that would be a good outcome: one would hope we collectively keep innovating and pushing what is possible and safe to do with smart contracts.