Developing your Data Package
In your new project generated from the template-data-package, the first steps for creating and developing your Data Package are already set up in main.py. For more detailed instructions on using Seedcase Sprout to organise your Data Package, see the guide on Sprout’s website. You can read more about the files and folders created by main.py on the Outputs page of the design documentation.
Creating package properties
1. Run main.py to create the scripts/package_properties.py file for the properties of your Data Package by using the recipe in the justfile:

       just build

   You can also run main.py by clicking the “Run” button in your IDE.
2. Open scripts/package_properties.py and fill in all required fields. Also fill in any optional fields you find useful. You can always update these later. Make sure to save the file.
3. In main.py, uncomment the lines referencing the package_properties and package_path variables.
4. Rerun main.py to create the datapackage.json file for your Data Package.
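After rerunning main.py, it can help to confirm that the properties you filled in actually made it into datapackage.json. The following is a minimal sketch using only the Python standard library, not part of Sprout or the template. The helper name missing_fields and the list of fields in EXPECTED_FIELDS are assumptions for illustration; replace them with the fields your project requires.

```python
import json
from pathlib import Path

# Hypothetical list of top-level fields to check (an assumption, not Sprout's
# official list of required fields). Adjust it to your own project.
EXPECTED_FIELDS = ("name", "title", "description")


def missing_fields(datapackage_path: Path, expected=EXPECTED_FIELDS) -> list[str]:
    """Return the expected top-level fields that are absent or empty."""
    properties = json.loads(datapackage_path.read_text(encoding="utf-8"))
    return [field for field in expected if not properties.get(field)]


if __name__ == "__main__":
    path = Path("datapackage.json")
    if path.exists():
        print(missing_fields(path) or "All expected fields are filled in.")
    else:
        print(f"{path} not found; rerun main.py first.")
```

Run this from the root of your project. Any field names it prints are ones you may still want to fill in within scripts/package_properties.py before rerunning main.py.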
Creating a new resource
If you already have the data
While you can create resource properties without data, it is a lot more challenging. If at all possible, create a resource properties object only when you have data that can pre-fill at least some of the important fields. To use Sprout, the data must already be in a tidy format. Once it is, load the data as a Polars DataFrame into the raw_data variable in main.py.
1. In main.py, uncomment the lines up to and including the creation of resource properties.
2. Fill in the resource_name argument.
3. Rerun main.py to create the scripts/resource_properties_<name>.py file for the properties of the new resource.
4. Open scripts/resource_properties_<name>.py and fill in all required fields. Also fill in any optional fields you find useful. You can always update these later. Make sure to save the file.
5. In package_properties.py, import your new resource properties by uncommenting the import line and updating it with the name of your resource. Also uncomment the resources field and update the name of the resource properties in the array to match the name of your new resource.
6. In main.py, import your new resource properties by uncommenting the import line and updating it with the name of your resource.
7. Uncomment everything else in the main.py file and rename the resource_properties variable to the name of the new resource properties you just imported.
8. Rerun main.py. This will:
- Update datapackage.json.
- Create a resources/ folder containing a folder for your new resource. In here, you will find a batch/ folder with the individual data batches you’ve uploaded for this resource and a data.parquet file containing all resource data.
- Update