Looking for the best way to store and manage tags in the HPR database, part 1
Hosted by Dave Morriss on 2017-03-10 is flagged as Explicit and is released under a CC-BY-SA license.
Listen in ogg,
mp3 format. | Comments (3)
We have been collecting and storing tags for new HPR shows for a while now with the intention of eventually offering a search interface. In addition, a number of contributors, including myself have been adding tags (and summaries), to shows that do not have them, since August 2015. There is still a way to go, but we’re making progress. At the time of writing (2017-01-31) 56.29% (1248) of all HPR shows (2217) have tags.
In recent times the way in which we should use these tags has been discussed. In show 2035 on 2016-05-20 droops suggested:
The website, which is a lot of work, needs to have related shows listed on each individual show’s page. This will take a tag system and someone to tag all of the almost uncountable previous episodes.
This episode begins a discussion about some of the ways that tags can be stored, managed and accessed efficiently in the HPR database.
I started planning a show about this subject in the summer of 2016, and the amount of information I have accumulated has grown since then. There is now quite a lot, so I am going to split what was originally going to be one show into three.
The subject becomes quite technical in the later shows, discussing database design techniques, and all three of the shows contain examples of database queries and scripts. If you are not interested in this subject than feel free to skip past. However, you might find this first episode more palatable, and any thoughts you might have on the subject would be appreciated.
I have written out a set of longer notes for this episode and these are available here.
Subscribe to the comments RSS feed.
Comment #1 posted on 2017-03-10T00:23:00Z by Mike Ray
See show 1569:
How to do a many-to-many relationship in a database.
Comment #2 posted on 2017-03-10T10:40:05Z by Dave Morriss
I hadn't forgotten your excellent show. It's been in my list of references all along. However, since I started by designing a single show which then got split into three, reference to show 1569 got relegated to the last show in the series.
I didn't quite appreciate the effect that would have, since the three shows were still one in my head. As it stands it looks as if I have disregarded your contribution, whereas what I had wanted to do was move slowly towards it, looking at possible alternatives and showing their advantages and disadvantages along the way.
Show two is in the queue for the 31st March, but show three is still in production. It will be the next show I upload though.
Comment #3 posted on 2017-03-10T16:42:55Z by Mike Ray
Listen to the entities
Wherever possible all database design should be driven by what the entity relationship is telling you, and Mr Codd should be obeyed.
In this case there are just two entities; 'show' and 'tag'. And their relationships are:
Show can have one or more tags
Tag can appear attached to one or more show.
Which gives rise to the many-to-many relationship like this:
The show_tag_xref table has a compound unique key comprised of the key column from the two outside tables, show and tag.
That's the pure analysis of the two entities concerned.
I can't think of any processing constraints, like speed or storage that would compel that relationship to be compromised. As you said in your part 1, this is a small database.
Note to Verbose Commenters
If you can't fit everything you want to say in the comment below then you really should record a response show instead.
Note to Spammers
All comments are moderated. All links are checked by humans. We strip out all html. Feel free to record a show about yourself, or your industry, or any other topic we may find interesting. We also check shows for spam :).