admin said in #3413 8h ago:
We've been slaving away for you in the code mines over the past few weeks trying to polish up the sofiechan tag system and UI, and we have some new gems. I'll focus on the tag system stuff today:
The autotagger is now live. Threads are ingested into a word frequency model that now spits out evidence of which thread belongs in which tags. It's a semi-supervised model mixing together our votes and the words we use to estimate tag mixtures. Even if we never suggested any tags, it would come up with its own based on word frequencies, sort posts into them, name them, and they would even be somewhat coherent. We tested this thoroughly to make sure the autotagger adds useful signal. It's not as clean as our votes, but it's more thorough and scalable. We hope the combination is an improvement once we get used to it. We'll continue to tune and experiment with it, but it's close enough now to be worth shipping.
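For the curious, here is a rough sketch of the general idea, not the actual sofiechan code. It assumes a plain smoothed word-frequency scorer (naive Bayes style): voted threads seed per-tag word counts, and any other thread gets a tag mixture from how well its words match each tag. The names `fit_tag_word_counts`, `tag_mixture`, and the `alpha` smoothing parameter are made up for illustration. A real semi-supervised model would also feed the estimated mixtures back into the counts and iterate, which this sketch leaves out.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and keep alphabetic word runs only."""
    return re.findall(r"[a-z]+", text.lower())


def fit_tag_word_counts(threads, votes):
    """Seed per-tag word counts from voted threads.

    threads: {thread_id: text}
    votes:   {thread_id: {tag: vote_weight}}  (hypothetical shape)
    """
    counts = defaultdict(Counter)
    for thread_id, tag_weights in votes.items():
        words = tokenize(threads[thread_id])
        for tag, weight in tag_weights.items():
            for word in words:
                counts[tag][word] += weight
    return counts


def tag_mixture(text, counts, vocab_size, alpha=1.0):
    """Estimate a normalized tag mixture for a thread from word frequencies."""
    words = tokenize(text)
    log_scores = {}
    for tag, word_counts in counts.items():
        total = sum(word_counts.values())
        # Additive smoothing so unseen words do not zero out a tag.
        log_scores[tag] = sum(
            math.log((word_counts[w] + alpha) / (total + alpha * vocab_size))
            for w in words
        )
    # Softmax over log scores, shifted by the max for numerical stability.
    top = max(log_scores.values())
    weights = {tag: math.exp(s - top) for tag, s in log_scores.items()}
    norm = sum(weights.values())
    return {tag: w / norm for tag, w in weights.items()}
```

The output is a mixture rather than a single label, so a thread can sit partly in several tags at once.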
The first new feature you'll notice from that is the "word cloud" on each tag page: a parody quotation built from the typical information-bearing words of that tag. The point is to give you a quick, objective sense of what a tag is about without anyone having to write definitive copy. It also serves as a guide for identifying and (re-)naming misnamed tags.
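One way to pick out "information-bearing" words, sketched under my own assumptions rather than taken from the real implementation: score each word by how much more often it shows up in the tag than in the site as a whole, weighted by its frequency in the tag. The function name `informative_words` and the scoring formula are illustrative choices, and `corpus_counts` is assumed to include the tag's own threads.

```python
import math
from collections import Counter


def informative_words(tag_counts, corpus_counts, top_n=5):
    """Rank words by p(word | tag) * log(p(word | tag) / p(word)).

    Common filler words score low or negative because they are no more
    frequent inside the tag than everywhere else.
    """
    tag_total = sum(tag_counts.values())
    corpus_total = sum(corpus_counts.values())
    scores = {}
    for word, n in tag_counts.items():
        p_tag = n / tag_total
        p_all = corpus_counts[word] / corpus_total
        scores[word] = p_tag * math.log(p_tag / p_all)
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


tag = Counter({"the": 10, "compiler": 5, "rust": 3})
site = Counter({"the": 100, "compiler": 6, "rust": 3, "bread": 20})
print(informative_words(tag, site, top_n=2))  # ['compiler', 'rust']
```

Note that "the" is the most frequent word in the tag but drops out, since it is even more frequent everywhere else.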
Speaking of which, tag naming is now live. If you hover over the title of a tag on its page, you'll see the top suggestions from the autotagger and other people for what that tag should be called. Please vote for your favorite tag names. The tags now fully belong to you, the posters. Feel free to seize control of a tag or two and turn them into your own personal fiefdom. That's what they are for. I'm going to turn on the actual results once we have enough votes to stabilize the names of the best tags, probably later this week. After that, tag names will be fully community-driven like everything else.
If tags have few threads or no good names, they will get denamed. Denamed tags lose their votes and are given fully over to the autotagger to redefine as it sees fit. They will reappear later as whatever rough cluster the autotagger can piece together, with a roughly suggestive name. But the real game is you, the posters, taking control of the tags and using them to drive the conversations you want to see. The autotagger is just a backstop. I will tune this dename threshold up and down occasionally to smoke out the bad tags. Let's find the good ones and occupy them.
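To make the dename rule concrete, here is a toy version. The `Tag` shape, the `should_dename` function, and both threshold values are hypothetical; the real thresholds are not stated here and will move around as they get tuned.

```python
from dataclasses import dataclass, field


@dataclass
class Tag:
    thread_count: int
    # Candidate name -> net votes (hypothetical representation).
    name_votes: dict = field(default_factory=dict)


def should_dename(tag, min_threads=5, min_name_votes=3):
    """A tag is denamed if it has too few threads or no well-supported name."""
    best_name_votes = max(tag.name_votes.values(), default=0)
    return tag.thread_count < min_threads or best_name_votes < min_name_votes
```

Raising either threshold denames more tags, which is the "tune up and down" knob in this sketch.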
The idea with tags is to give us a topical index over our discussions, to focus and suggest our areas of interest, and to create space for sub-conversations and sub-communities to develop without seeing or worrying about what other people are discussing elsewhere. Right now we don't have the scale for this to matter. It's still possible to use sofiechan entirely from the front page. But hopefully soon we're going to need to break out into topical sub-boards. The tag system is now fully ready for that, and we'll see what kind of problems come up in practice before we do more with it.
Happy posting
referenced by: >>3415