CWCki talk:Transcription Collective

To do

  • make neato user box to identify users who are part of the Collective <--- Done? 37 Rb 85.468 20:38, 24 February 2010 (UTC)
  • promote it and stuff.
  • have this page serve as a place to discuss meta transcription issues.

Point is, Clyde likes to joke that I need to get transcribing [insert new Chris media here]. But in reality, there are a lot of hard-working transcriptionists who manage to transcribe new videos within minutes of them being released. It's something that very much impresses me, and it's what makes CWCki so damn great.

One of the goals should be to utilize such efforts towards completing a lot of unfinished transcripts and to allow for better collaboration on transcription projects.--Champthom 18:07, 24 February 2010 (UTC)

  • As the author of CWCki's style guide (or what's currently passing for one), I figure I'm one of the dudes you're talking about. You can count on me to keep our transcripts stylish. ;3 Llort 20:06, 24 February 2010 (UTC)

So is this a good place to discuss color-coding?

Because I'd like to address that. I personally find it easier to read when we only color the speaker's name instead of his whole chunk of text. It's also easier to see the links that way.

But that's just me.--Beat 05:04, 25 February 2010 (UTC)

  • I've always liked the fully-colored dialogue, but then again, I can't remember the last time I saw a colored-names-only transcript, so it's hard for me to say for sure. Llort 05:13, 25 February 2010 (UTC)
  • It's here and there. I just did it in Chris's PS3 menu fest. Tell me what you think.--Beat 05:26, 25 February 2010 (UTC)
  • I think I like full color more, but that may just be familiarity speaking. Colored names with black text certainly does make links more obvious. Neither way is right or wrong, and I guess my opinion is neutral. Llort 06:11, 25 February 2010 (UTC)
  • Key thing though should be 1) linking and 2) readability. Light green text on a white background is hard to read, dude. Yes, it's readable but it's not easy on the eyes. Furthermore, Chris as the same blue as links is kinda lame because then it's hard to see links and when I tell people to add links they bitch about "Well how are we supposed to see the links if Chris is dark blue?" There's nothing wrong with some color coding; my problem with this system is that it's putting aesthetics over wikiness and readability. The latter two should take precedence over the former. I think the best solution would be something along the lines of what Beat proposes, that is, color-coded names and black text.--Champthom 16:54, 27 February 2010 (UTC)
  • I also prefer the full colour lines. It just makes it easier for me to follow the transcripts. But I do agree it's annoying that Chris is the same colour as links. Would it be better to just not give Chris a colour at all? He's Chris. The whole wiki is about him. I get that different characters need different colours, but Chris is Chris all the way through. I don't think anyone would mind if we just set Chris's (Chris? Chrises? Chris's? I hate apostrophes) colour as "black". As long as there's a clear line where the transcripts start and end, I don't think it would cause any trouble. Stickly 12:42, 1 March 2010 (UTC)
  • This is a good idea, but the problem is that people use black for "action" like for sighs and such. However, if a troll says something that needs an explanatory link, it'll look weird if, say, their text is green and all of a sudden there's a blue link. Beat's treatment of the Father Call, IMHO, seems to lay the groundwork for a new sort of transcription style. --Champthom 18:37, 1 March 2010 (UTC)
  • I don't quite get what you mean by the "using black for action text is a problem" since in the Father Call you link to, all of the text is black and the "action" text is still black. But after reading through it, I've gotta say I prefer the colour names / black text system now. It looks more... proper, somehow. I'll put some work in later and change the Alec Benson Leary calls over to the "coloured names / black text" thing. I'll do them in one diff so they're easy to revert if anyone disagrees later. I'll try to make the use of action text more uniform too. Stickly 16:42, 2 March 2010 (UTC)
  • Okay, I realised that just going through everything and changing it without checking with anyone first was kind of an asshole thing to do, so I just did the first one since Alec's colour was wrong and needed changing anyway. If people agree that it's better I'll do the rest. It went a lot faster than I thought it would because I forgot about the "find and replace" tool in Word. Stickly 17:55, 2 March 2010 (UTC)

Segmenting transcripts.

I think that we need to consider splitting transcripts into subsections whenever the audio goes on for longer than 30 minutes. Most of the Alec Benson Leary calls are pretty long, some of the Mumble chats go on forever, and the Father Call goes past the 2 hour mark. There are a couple of transcripts that do this, such as Emily date and Mumble 1. I'm going to try and add similar breaks to some of the other very long transcripts, probably starting with the longer ABL calls. Someone let me know if I fuck up too badly.--Beat 22:31, 26 February 2010 (UTC)

  • I liked the way that Dkaien and I ended up handling Liquid and Kacey Call. (Although I never did get around to finishing all of my intended editing for that page.) Other methods may work just as well. Llort 01:50, 2 March 2010 (UTC)

March is Mumble Month

If there is one major task on the CWCki, it's to finish transcribing unfinished transcriptions. Yes, it's not easy to transcribe Mumble chats and stuff. However, given that we were able to get a two-hour Father Call transcribed within a day of it being released, there is no excuse for having nearly year-old Mumble chats still unfinished. These provide gold in terms of information about Chris and until they're transcribed, they're not being fully utilized.

I recall someone being like "But Champthom, why don't we just leave the summaries since no one will want to read the whole transcript?" To that I say BULLSHIT. The problem with summaries is that they essentially reflect whatever the person summarizing thinks is important. They're still good to have if people are really lazy, but if someone wants to read the transcript, they should be able to do so and draw their own conclusions about the given chat.

So I declare March to be Mumble transcription month, with the goal being to transcribe the Mumble chats by the end of the month. First project will be finishing transcribing Mumble 2. Those who contribute will get a SUPER DUPER SPECIAL user box for their page, I swear.

So get at it. --Champthom 09:16, 1 March 2010 (UTC)

  • And to prove I'm not kidding, check out this Medal I rigged in MS Paint:
Mumble Service Medal March Mumble Madness
This user nearly went mad in March 2010 to provide CWCki with accurate and complete transcriptions of the Mumble chats.
Yes friends, it can be yours if you contribute to transcribing the Mumble chats this month. --Champthom 09:33, 1 March 2010 (UTC)

I know it's not a Mumble, but the 90-minute audio recording of the Emily Date badly needs transcription help. Currently, only random segments are transcribed, covering about half of the full audio. There are still two holes in the middle (one 8 minutes long and one 40 minutes long) and a third hole near the end (only four minutes long). I'm much better at editing transcripts than I am at typing them up, so I'd appreciate it if someone else took the initiative on this one. I've already fixed all of the formatting and pointed out all of the gaps, so the road ahead is clear. TL;DR: PLZ TRANSCRIBE DIS BITCH!! Llort 04:57, 4 March 2010 (UTC)

  • As I said on the Bob Chandler page, thank you for bringing this to my attention, personally. I didn't notice that the entire date has not been transcribed when this is a crucial transcript which, for starters, has vital information about Bob. Mumble chats are important, but it is important we finish transcribing this.--Champthom 07:39, 5 March 2010 (UTC)

Let's call it "March Mumble Madness." --Munch 05:37, 4 March 2010 (UTC)

  • I like it. --Champthom 07:39, 5 March 2010 (UTC)

I'm working on mumble 8 at the moment. Can stop that for a moment and get onto the Emily Date or Mumble 2 if someone wants? EnglishPickle

  • Scrap that, have been working on Mumble 2 now. Should finish it in a couple of days. --EnglishPickle

And the results

Guys, I am proud of those who contributed to the project. Yes, we didn't get to all the Mumble chats but just because it's not March doesn't mean Mumble chats can't be transcribed still. Nonetheless, in a one month period we got a lot of work covered and I salute all of you who contributed. I need to go through the Mumble articles and see how much was contributed overall, and award medals to those who did.

Clyde believes we failed because we didn't do every Mumble chat but I disagree strongly. We still have more Mumble transcripts transcribed than before and hopefully we can work on transcribing all the chats. --Champthom 06:51, 5 April 2010 (UTC)

YouTube is gonna automatically caption videos

Srsly. Now there's like, no reason to transcribe stuff. All we really have to do then is transcribe stuff like released audio, and smooth out any glitches from the voice recognition software that YouTube is gonna use. --Champthom 01:13, 6 March 2010 (UTC)

"any video that meets sound and technical quality standards" - I don't think any of Chris's videos will meet those. Some of the videos already auto transcribed are pretty bad, like this one --Digital 03:00, 6 March 2010 (UTC)
It'll make transcribing easier, sure, but I actually like reading the transcripts over watching the videos. The less I have to hear of Chris's grating voice, the better. I realize this may seem odd considering how many videos I've transcribed. My goal is to make sure nobody ever has to watch a Chris-Chan video twice.--Beat 04:14, 6 March 2010 (UTC)
Let's not get too excited, folks. Judging by that DPF video, it looks like YouTube's speech recognition software only manages about 20% accuracy, which is - needless to say - horrendous. What's worse: that video has better audio quality for the software to work from than do virtually all of Chris's videos and all of the leaked audio recordings. Therefore, human transcription will be needed for a very long time to come, chums. Llort 04:42, 6 March 2010 (UTC)
Plus Chris speaks so badly that any voice recognition program, even the best around, is going to have trouble understanding him. They require pretty clear diction even after you "train" them for a few months. Chris's speech impediments and stutters are going to throw any auto captioning into disarray. The results should be funny, though. Stickly 06:23, 6 March 2010 (UTC)
Yeah, speaking as someone who actually makes a living as a transcriptionist, the idea of automatic transcription systems makes me giggle my ass off. Someone who speaks the way Chris does, the machine is going to mangle all to hell and back. Automated captioning systems tend to provide a script that reads like someone with Tourette's having a stroke. Kazmeyer 06:56, 6 March 2010 (UTC)
Chris isn't all that (in)famous, anyways. I bet barely a million people have heard of him. Also, if his videos do get captions on Youtube, it may also boost his ego. --Marco 07:42, 6 March 2010 (UTC)
He probably won't know how to turn them on. If he does, he's probably gonna go into tard rage mode. --Munch 07:45, 6 March 2010 (UTC)
The automatic video transcriptions are now online. Hint: If the auto-transcription flubs Deth's fairly clear narration horribly, you can just imagine how horribly it flubs Chris's videos (the few that have HQ versions). Just check out Christian Love Day. --wwwwolf (wake me when you need me) 12:05, 6 March 2010 (UTC)
That was beyond hilarious.--Beat 15:14, 6 March 2010 (UTC)

Word cloud from transcripts of choice videos

I took some transcripts, stripped out the bracketed parts, and fed them into Wordle to see which words Chris prefers. Here's what I got back. --Munch 19:58, 6 March 2010 (UTC)

Notice the size of "uh".
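
For anyone who wants to try the same thing without Wordle, here's a rough sketch of the idea in Python: strip out the bracketed action text, then count how often each word shows up. This is not what Munch actually ran. The function names (strip_bracketed, word_counts) and the sample line are made up for illustration, and it doesn't try to remove speaker names, so those would need stripping separately on a real transcript.

```python
import re
from collections import Counter


def strip_bracketed(text: str) -> str:
    """Remove [bracketed] action text such as [sighs] (hypothetical helper)."""
    return re.sub(r"\[[^\[\]]*\]", " ", text)


def word_counts(text: str) -> Counter:
    """Count lowercase words in the text after removing bracketed parts."""
    cleaned = strip_bracketed(text).lower()
    # Treat runs of letters and apostrophes as words; everything else splits.
    words = re.findall(r"[a-z']+", cleaned)
    return Counter(words)


# Made-up sample line, not taken from any actual transcript.
sample = "Uh, I, uh, [sighs] like the, uh, games. [long pause] Uh, yeah."
counts = word_counts(sample)
print(counts.most_common(3))
```

The resulting counts are the kind of thing a word cloud tool scales its font sizes by, so a filler word that dominates the counts will dominate the cloud too.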