Creating a domain and data model and finding a suitable representation

budget-planner

#1

Having almost finished :checkered_flag::ballot_box_with_check: specifying the development environment, we are currently in the phase of :checkered_flag::arrow_forward: domain and data modelling for reusability and interoperability, in preparation for :checkered_flag::pause_button: prototyping a workable example of the full stack for Thessaloniki at the beginning of November. Later on we seek to entangle our work with our ecosystem. To support this cooperative praxis, I am taking a moment to reflect on the current state of the discussion and to invite comments about possible paths to take.


Domain-driven design and model-driven development

When trying to find a :grey_question: shared language, we are left not only with the hard work of formulating an abstract agreement on the glossary of terms and their relations, as we attempted during the discussion of the :memo: SolidBase datastructure(s), but also with the constraints of the machine-readable representations we have at hand.

While rewriting the :open_book: presentation of budgets user story, @gandhiano, @yova and I scoped acceptance criteria and concrete tasks. We discovered that we had initially skipped a thorough and clear definition of What is a budget? for us. A stabilising mutual understanding has since been derived by learning from the trainers manual and from exemplary spreadsheets of the preliminary research.


One possible implementation path

At the moment we are consolidating our different views on the subject in more structured forms. Although we still aim at clearly separating metadata and data, linked open data challenges our understanding of data representation with the fact that (1) both are expressed in the same graph format and (2) “:page_facing_up: the Semantic Web technology stack […] [is] not hiding the fact that Reality is Hard”.

One answer to this challenge can be to maintain an :open_book: ontology for SFS budgets in the TransforMap Wikibase, as we previously did with multilingual taxonomies for :link: SSEDAS and :link: TransforMap and :link: other taxonomic explorations in there.

If the Wikibase instance is set up accordingly for :page_facing_up: content-negotiation, the IRIs of its Ps and Qs could be used to describe an ontology that is reusable for objects persisted with SoLiD. The question is how to mediate between the Wikibase way of doing Linked Data and the Linked Data Platform standards, so that we can serialise our data to be persisted in RDF, the Semantic Web data model.

An example of how this may work can be found within the source code of the TransforMap viewer and the sources of its counterpart.

The procedure can be manually reproduced by

  1. Selecting the transformap namespace in https://query.base.transformap.co/bigdata/#namespaces by clicking the Use link next to it and
  2. Entering the following query into https://query.base.transformap.co/bigdata/#query
    prefix bd: <http://www.bigdata.com/rdf#>
    prefix wikibase: <http://wikiba.se/ontology#>
    prefix wdt: <https://base.transformap.co/prop/direct/>
    prefix wd: <https://base.transformap.co/entity/>
    # schema: is used below and has to be declared unless the endpoint predefines it
    prefix schema: <http://schema.org/>
    SELECT ?item ?itemLabel ?instance_of ?subclass_of ?type_of_initiative_tag ?wikipedia ?description
    WHERE {
      # every item that reaches wd:Q8 via zero or more wdt:P8 steps
      ?item wdt:P8* wd:Q8 .
      ?item wdt:P8 ?subclass_of .
      OPTIONAL { ?item wdt:P4 ?instance_of . }
      OPTIONAL { ?item wdt:P15 ?type_of_initiative_tag }
      # English description and English Wikipedia article, where available
      OPTIONAL { ?item schema:description ?description FILTER(LANG(?description) = "en") }
      OPTIONAL { ?wikipedia schema:about ?item . ?wikipedia schema:inLanguage "en" }
      # fills ?itemLabel with the English label
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
    }
    

Further discussion of this happened in


Then @yova also discovered further resources that raise the question of how we can reuse terms and concepts from preliminary work.

Linked Data for Financial Reporting

:page_facing_up: xBRL - the business reporting standard

:page_facing_up: Project to Convert XBRL Financial Information to RDF/OWL

:abacus: XBRL Cloud: Evidence Package - Component Perspective

To conclude, @yova expressed:

If I get it right, all we need to do to set up a graph-based data structure is to define the concept of SFS budgeting somewhat like this:

:page_facing_up: Extensible Business Reporting Language (XBRL) 2.1 - 5.1.1 Concept Definitions


Outlook

We are aware that an ontology is a specialised form of a vocabulary. For our case it may be sufficient to describe a hierarchic topology of terms in a tree structure. This is called a taxonomy.

  • Do we all agree on this development direction?
  • What might be its caveats?

I feel we can only continue the discussion of the desired relations between the terms once we have transformed the diverging sketches from the pad into a more structured and less fluid depiction in the Wikibase.


#2

Thank you for this summarizing status report, @yala! Maybe you can expand some of the abbreviations (IRI, P, Q) to make it easier to follow for readers who are less experienced with LOD?

Please also see the first attempts of simplifying SFS economics: Attempt on unifying SFS economics

After having a second look at the ESSglobal vocab, it seems to be a good starting point for our endeavor ;). However, it lacks concepts for cost periodicity and other details, so an extension using base.transformap.co seems to be necessary.


#3

IRI = Internationalised Resource Identifier

https://tools.ietf.org/html/rfc3987

Ps = P-numbers
Qs = Q-numbers

From https://www.wikidata.org/wiki/Wikidata:Glossary we can take:

Entity […] Every entity is uniquely identified by an entity ID, which is a prefixed number, for example starting with the prefix Q for an item and P for a property.
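
To make the relation between P/Q-numbers and IRIs concrete, here is a minimal Python sketch that expands prefixed names into full IRIs. The two prefixes are copied from the SPARQL query in post #1; the helper name `expand` is made up for illustration.

```python
# Prefixes as declared in the SPARQL query in post #1.
PREFIXES = {
    "wd": "https://base.transformap.co/entity/",
    "wdt": "https://base.transformap.co/prop/direct/",
}


def expand(curie: str) -> str:
    """Expand a prefixed name such as 'wd:Q8' into a full IRI (hypothetical helper)."""
    prefix, _, local = curie.partition(":")
    if prefix not in PREFIXES or not local:
        raise ValueError(f"cannot expand {curie!r}")
    return PREFIXES[prefix] + local


print(expand("wd:Q8"))   # the item Q8
print(expand("wdt:P8"))  # the property P8, used as a predicate in the query
```

So a Q-number or P-number is only the last path segment of an IRI; the prefix decides in which role (item, direct property) the identifier is used.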

Where should I put my focus when reading this, and how would that relate to the above?


#4

It’s basically examples for datasets we need to implement.


#5

You are talking about the linked spreadsheets?


#6

yes


#7

Thank you for clarifying your motivation in linking that conversation.


#8

Hi all, @yala invited me and @bhaugen to comment, and we are very happy to be here. Really interesting use case to us!

We have looked through this https://hack.allmende.io/solidbase_datastructure# and have a few questions before responding with something we hope will be useful.

  1. What does a budget line represent in the real world? (We understand it is a line in a report.)

  2. Similarly, what is an Activity? How do you think about it? It is something that somebody does? Now? Repeatedly over the course of a year?

  3. In one place I saw multiple accounting periods for 1 budget, in another place I saw 1 accounting period for 1 budget. But then I also see that an amount is for the interval of an accounting period… or does that mean an amount for an interval, like nnn per year, not so much nnn for 2019? Can you describe a little more what users are thinking of for accounting periods at what level of the budget?

  4. What are “members” of a budget?

  5. In Effort, what is “factor”?

  6. What are “marketing channels” and how do they fit in to the use case?

Hopefully these are not too annoying, and if there is another place we should look first, then please let us know. We think you all are going in a good direction and just want things to be more clearly defined so we can think about the overall model with more clarity.

Thanks!

Lynn and Bob
mikorizal.org
valueflo.ws


#9

I am starting to discuss budgets and planning in this ValueFlows issue:

I cited your use case in this comment: https://github.com/valueflows/valueflows/issues/284#issuecomment-432211753

I’d be grateful if you looked at what I wrote about Solawis and corrected any misunderstandings. I am finished with the preliminaries now, and will go on to sketch out a ValueFlows model for budgets, and then map the VF model to your model and see how they compare. I think they will be compatible.


#10

Dear @fosterlynn and @bhaugen,

thanks a lot for your review of our sketches! I hope we’ll achieve a working data structure for our tool with your support soon. Your questions are not annoying at all, but help us to gain more clarity of the subject.
To answer your questions:

  1. We use the word budget for the list of all production costs of the Solawi, whose total is divided by the number of members to get the cost per share. One line of this budget is a budget line and represents the costs of one category that is comprehensible to the membership. These are the summed-up costs of a similar kind arising from activities.
  2. Activities are all economic actions that need monetary representation or compensation, at different abstraction levels. So planting 1 acre of carrots (production procedure) might be an activity, as well as operating a horticulture (business branch) for a year, or running a Solawi. Because of the last point, that the whole endeavor could be seen as an abstract activity, we came to the conclusion to equate budget line with activity.
  3. Yay, the time aspect is still a bit unclear. As a top priority we only need a presentation of annual values, but in real life, for use on the farm, especially for setting up the budget and for controlling/monitoring running costs, it would be quite useful to be able to split it up into months. So the effort needs to be split up into 12 parts, each representing a month.
  4. The budget is the total cost of the production done for the members. But as both budget and members belong to the same, say, SFS, and share the same time aspect, the members don’t need to be directly connected to the budget. It’s important to note, though, that the membership of an SFS changes annually.
  5. The factor is another word for price. It’s the number by which a non-monetary value is multiplied to get a monetary value.
  6. In the long run this tool might be of use for the economic monitoring of farms that also have marketing channels other than CSA. Tags on activities might help to separate activities for different marketing channels. But that’s not that well thought out yet. I think tags are always nice to have for transversal sorting of things…:wink:
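
The answers above can be sketched in a few lines of Python. This is only a rough illustration under assumptions: the class names `Effort` and `BudgetLine`, the equal split into twelve months and all figures are invented and not taken from the pad.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Effort:
    """A non-monetary quantity priced by a factor (unit cost)."""
    label: str
    quantity: float  # e.g. hours, kilograms, hectares
    factor: float    # monetary value per unit of quantity

    def cost(self) -> float:
        return self.quantity * self.factor


@dataclass
class BudgetLine:
    """One activity, treated as one budget line."""
    name: str
    efforts: List[Effort]

    def annual_cost(self) -> float:
        return sum(e.cost() for e in self.efforts)

    def monthly_costs(self) -> List[float]:
        # Naive assumption: twelve equal parts; real months will differ.
        return [self.annual_cost() / 12] * 12


def cost_per_share(lines: List[BudgetLine], members: int) -> float:
    """Total annual budget divided by the number of members."""
    return sum(line.annual_cost() for line in lines) / members


# Invented example figures, for illustration only.
budget = [
    BudgetLine("Carrots", [Effort("work", 100, 15.0), Effort("seeds", 20, 30.0)]),
    BudgetLine("Land", [Effort("lease", 2, 450.0)]),
]

print(cost_per_share(budget, members=50))  # 60.0
```

Members are passed in separately here, which mirrors answer 4: they belong to the SFS and the year, not to the budget itself.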

#11

I find that the use of the word factor can be confusing. Why not use cost (per unit being unit cost and total being total cost), which is pretty standard and not associated with monetary transactions (i.e. not directly related to price, costs can also be non-material)?


#12

Is “Effort” the same as “inputs” or “requirements” for an Activity? Like, what do we need to perform that activity? Materials, equipment, work, etc etc?


#13

@bhaugen: Efforts are the costs for an activity, and they can be nested. Materials are activities of buying things (month-based); equipment is an investment, so actually an activity that runs several years (good point! Accounting could be done on a yearly basis); work is just an activity with a special cost per hour (monthly).

@gandhiano I don’t like these financial terms, that’s why I always try to find more general/neutral ones. But it may be that total cost/unit cost would be more comprehensible.
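
A small sketch of what nesting could look like, assuming each activity carries its own cost plus sub-activities, and that an investment is spread evenly over the years it runs. The `Activity` class, the field `duration_years` and the figures are assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Activity:
    name: str
    own_cost: float = 0.0       # cost attached directly to this activity
    duration_years: int = 1     # > 1 for investments such as equipment
    parts: List["Activity"] = field(default_factory=list)

    def annual_cost(self) -> float:
        # Spread the own cost evenly over the duration, then add nested activities.
        return self.own_cost / self.duration_years + sum(
            part.annual_cost() for part in self.parts
        )


# Invented figures, for illustration only.
horticulture = Activity(
    "Horticulture",
    parts=[
        Activity("Buying seeds", own_cost=600.0),
        Activity("Work", own_cost=1500.0),
        Activity("Tractor", own_cost=10000.0, duration_years=5),
    ],
)
solawi = Activity("Running the Solawi", parts=[horticulture])

print(solawi.annual_cost())  # 4100.0
```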


split this topic #14

A post was split to a new topic: Clarifications about the Solawi model


#15

I just reminded myself of our earlier conversations on this topic:
https://www.loomio.org/d/U0p98Cye/ovn-and-sfs

Good to see that it continues to make progress. You asked a couple of other questions in Loomio that I will reply to there.


#16

I started thinking about your models. I’m missing plenty of important details, but am trying to think about the center of it first, and see what you think along the way. I’ll try a draft of the whole model next.

This starts with a re-draw of your last picture from https://hack.allmende.io/solidbase_datastructure#.

Then the bottom is a UML-ish representation of that with a little ValueFlows thrown in.

Note: This shows the budget line items using the resource classifications as their grouping. It could also be process classifications (types of activities). I think they can just be summarized data, although maybe you will need to note somehow what classifications should be used for the budgeting.
[Image: solawi-1]

Process Classifications are Farming, Gardening, Planting. Resource Classifications are Labor, Land. Actions are use, work, could be others, like consume for seeds. And this shows only inputs.
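
As a rough sketch of "just summarized data": the same input records can be grouped either by resource classification or by process classification to produce budget line items. The record layout, the function name `summarise` and the amounts are assumptions; only the classification and action names come from the paragraph above.

```python
from collections import defaultdict
from typing import Dict, List

# Invented input records; only the classification names come from the thread.
inputs = [
    {"process": "Gardening", "resource": "Labor", "action": "work", "amount": 1500.0},
    {"process": "Gardening", "resource": "Land", "action": "use", "amount": 900.0},
    {"process": "Planting", "resource": "Labor", "action": "work", "amount": 600.0},
]


def summarise(rows: List[dict], key: str) -> Dict[str, float]:
    """Sum the amounts per value of the chosen classification key."""
    totals: Dict[str, float] = defaultdict(float)
    for row in rows:
        totals[row[key]] += row["amount"]
    return dict(totals)


print(summarise(inputs, "resource"))  # budget lines grouped by resource classification
print(summarise(inputs, "process"))   # the same data grouped by process classification
```

Which key is used for the budget would then be a setting of the budget, not a property of the underlying records.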

I like the name Activity. Not sure I like Effort, seems too narrow. Will think more.

Does this seem useful at all to you? Please be honest. :slight_smile: Suggestions of how to move forward appreciated.


#17

Thank you for this intermediary, visual thought piece. I am about to start a metastable human- and machine-readable data model at https://base.transformap.co/wiki/Item:Q237 right now. While this will be a bit rough around the edges and will have no visuals available immediately¹, I hope we can use this as a directory of terms for us, and as a space where we can design their relations.

The site works pretty much like Wikidata, except that we do not federate its items and properties so far. This means we need to recreate any subject or predicate that we found elsewhere and want to use. Some of the first search results for an introduction to Wikidata: https://www.wikidata.org/wiki/Wikidata:Introduction, https://www.bjoern-hartmann.de/post/a-brief-introduction-to-wikidata/

Please send a Private Message via the blue button on https://discourse.solawi.allmende.io/u/yala if you wish an invitation to access that site. Due to spam, public registration is currently turned off. I will send you an invitation to create a user upon request.

What I find difficult to perceive in your image is that the upper part is an example of how to use the items from the lower part as a vocabulary for actual data. I recognise that this is similar to what we see at the bottom of the pad: a graph visualisation of a set of statements describing a budget.

It’s interesting for me to witness in your example how a budget line can be thought of as a compound of multiple efforts, each of which makes up part of a different activity. As the pad and our discussion are currently circling around different, some probably outdated, parts of the shared model, I hope the glossary will help to choose which terms survive the evolution of models here.


¹ For that to work we need to fork some of the nice applications in https://www.wikidata.org/wiki/Wikidata:Tools/Visualize_data and adapt them to our data source, which is not Wikidata. This could eventually result in PRs upstream contributing a choose your endpoint feature.


#18

I have now created a few exemplary items and properties which make up parts of what we modeled above. I am not yet sure how to represent the higher-order task of describing the data model before describing an actual instance.

In the item pages below, the Q-numbers, you will therefore find references to a Blank node, but also additional metadata describing what is being expected in that kind of relation. I sense I am modeling classes of objects that can be instantiated.


These are helpers that I created:



I have stopped before modelling efforts, because @fosterlynn’s new ideas about process and resource classifications, plus actions kind of distracted me. But they are very useful input!

A difficulty we will encounter is that Wikibase itself already offers multilingual labels and descriptions for each item and property, which is why it is a bit tedious to name those as individual classes themselves. But I think it is good to describe the actual expected properties of an instance here, since we will be using those IRIs to denote keys in a namespace for the @context of our RDF objects, which are going to be serialised as JSON-LD for SoLiD. How such a vocabulary needs to be modeled is a second-degree question, as it does not need to be dereferenceable right from the start. The definitions of some vocabularies that are still in use are offline. Diving through lov.okfn.org can lead to sheer surprises.
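
To illustrate how Wikibase IRIs could serve as keys in a JSON-LD @context, here is a minimal sketch using only the Python standard library. The identifiers P901, P902 and Q900 are placeholders and do not refer to existing entries in the Wikibase; the key names are invented as well. The last step is a naive key lookup to show the effect of the context, not a real JSON-LD processor.

```python
import json

ENTITY = "https://base.transformap.co/entity/"
DIRECT = "https://base.transformap.co/prop/direct/"

# Placeholder identifiers (P901, P902, Q900), for illustration only.
context = {
    "name": DIRECT + "P901",
    "annualCost": DIRECT + "P902",
}

budget_line = {
    "@context": context,
    "@type": ENTITY + "Q900",
    "name": "Carrots",
    "annualCost": 2100,
}

# This string is what would be stored as a JSON-LD document.
print(json.dumps(budget_line, indent=2))

# Naive illustration of what the context does: short keys map to IRIs.
expanded = {
    context.get(key, key): value
    for key, value in budget_line.items()
    if key != "@context"
}
print(json.dumps(expanded, indent=2))
```

The short keys stay readable for app developers, while the IRIs behind them point back to the item and property pages where the terms are described.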

You can find a summary of recent activity on the page https://base.transformap.co/wiki/Special:RecentChanges

Please tell us what you see and how you imagine this could help us to define, in a wiki way, what we are talking about.


#19

Thanks for spotting this. This was an attempt at allowing budgets at different levels of granularity. Imagine a yearly budget with amounts for every activity, but when you zoom in you see it is comprised of individual amounts for each month. These could be the multiple intervals you saw.

In another way of zooming in, one could imagine that for the past, budgeted values are accompanied by a second line below showing the actual balance of a given activity per interval.

Here we wanted to prepare for the case that users are not only going to model an anticipated budget for a year based on yearly values, but might project monthly values from the accounts in the past into the future, like GnuCash does.
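
A minimal sketch of these zoom levels, assuming one list of twelve budgeted amounts per activity and a shorter list of actual amounts for the months already recorded. The projection simply repeats the mean of the recorded months; this is a simplifying assumption and not a description of how GnuCash works. All figures are invented.

```python
from typing import List

# Invented figures for one activity.
budgeted: List[float] = [175.0] * 12      # twelve monthly intervals
actual: List[float] = [160.0, 190.0, 175.0]  # months recorded so far

# Zooming out: the yearly value is the sum of the monthly intervals.
yearly_budget = sum(budgeted)

# Second line below the budget: difference between actual and budgeted per interval.
variance = [a - b for a, b in zip(actual, budgeted)]


def project(recorded: List[float], months: int = 12) -> List[float]:
    """Fill the remaining months with the mean of the recorded ones."""
    mean = sum(recorded) / len(recorded)
    return recorded + [mean] * (months - len(recorded))


print(yearly_budget)         # 2100.0
print(variance)              # [-15.0, 15.0, 0.0]
print(sum(project(actual)))  # 2100.0
```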


#20

Thank you all for the input!

As we now have a basic taxonomy, we can start to create the workflow of shuffling data from and to SoLiD with the app.

I think we can use our experimental self-made model for now, but in the long run, and to keep backward compatibility with other accounting methods, it is important to keep in mind that we are dealing only with financial data here, for which some interesting languages already exist, like http://www.xbrlsite.com/2015/fro/ and http://xbrl.squarespace.com/conceptual-model/ . For importing financial statements this would probably be the way to go in the future, but for now we can continue with our self-made vocab.