Posts that are teaching open source-ish
Karsten and I task-swapped last week – he’s been driving Fedora Insight like a storm, and I took the Summer Coding SIG… and stared at my screen for days. When I don’t understand the global picture, I have a hard time diving into the tactics that need to be executed, and instead freeze up and babble aimlessly. I finally realized that the next step was to draw out the global picture, and whipped up the video below in a 30-minute whiteboard frenzy this morning.
What we’re trying to do is a generalization – think Google Summer of Code with the “Google” replaced by “any funding organization,” and “Code” replaced by any deliverable important to an open source project. What’s the underlying model here? How could, say, the Uncle Pennybags Summer of Test Plans build on this framework?
Karsten later told me this is called a “swim lane chart” – apparently I reinvented the wheel, thinking that I’d come up with the idea of sorting flowcharts into columns by role. ;-)
Now, I’ll put out the disclaimer that this comes from my brain aggregating everything I’ve seen and heard on Summer of Code and similar programs, and that there may be bugs or missing bits in the workflow I’m describing. Part of the reason I’m throwing this video out there is to get an idea of where I’m wrong. I’m sure I’m wrong in many, many places – if a patch needs to be made, please say so!
If this sounds about right, we should Inkscape up the flow chart and type out the narration and put that document up… somewhere. Any takers, or ideas where the upstream ought to be? I’m putting it in the Summer Coding SIG by default, but am happy to migrate it somewhere if there’s a better upstream to be found.
Thursday, March 4th, 2010 | fedora, teaching open source | No Comments »
Had a little fun today while talking with Diana about her research. She was explaining how it’s hard for her to figure out who’s an active Fedora contributor since there are so many ways and means and places (git, wiki, lists, etc) to contribute and everyone contributes in a different place (someone may maintain a lot of packages but never blog, another person may blog and never touch the wiki, etc), so I pointed out that just about everything in Fedoraland was a website with FAS authentication, so “fire up twill, scrape ‘em all down, do some text processing, and you’ll have a per-user portfolio you can analyze to get an activity count.”
8 minutes of coding and 29 minutes of documenting later, a quick and dirty solution prototype is up in the form of FAS scraper. It takes a list of FAS usernames and makes little portfolios for each user using his/her recent activity from a variety of services (so far, just wiki recent changes and packages-maintained). This isn’t meant to prototype the architecture of the code (this code basically has no architecture, it’s 11 lines long), it’s meant to be a rough demo of desired functionality. Think about making the user-portfolios themselves more query-able and you’ll have a notion of how this could be extended – it would be neat to run queries like:
- How many people who blogged on Planet at least twice a month for the last 6 months are also frequent wiki editors?
- Show me all the users who maintain at least 10 packages and are also members of the sysadmin-test FAS group.
- For each user with recent wiki edits to the F13 Talking Points page on the wiki, give me all their emails to the advisory-board list in the past month.
I’m sure you can think of better ones. For students (and professors), I actually think this might make a neat variant of a problem set I was once given in college.
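To make the second query above a bit more concrete, here is a rough sketch of what "query-able portfolios" could look like with nothing but shell tools. The directory layout (one folder per user, with packages.txt and groups.txt holding one entry per line) is entirely made up for illustration; the real scraper just saves raw HTML, so something would have to produce these files first. The usernames and the lowered package threshold are also assumptions to keep the demo tiny.

```shell
# Hypothetical layout, NOT the real FAS scraper output:
#   <portfolios>/<username>/packages.txt   one maintained package per line
#   <portfolios>/<username>/groups.txt     one FAS group per line
dir=$(mktemp -d)
mkdir -p "$dir/alice" "$dir/bob"
printf 'pkg-a\npkg-b\npkg-c\n' > "$dir/alice/packages.txt"
printf 'sysadmin-test\ndocs\n' > "$dir/alice/groups.txt"
printf 'pkg-d\n' > "$dir/bob/packages.txt"
printf 'sysadmin-test\n' > "$dir/bob/groups.txt"

min_packages=3        # the query in the post says 10; 3 keeps the demo small
group=sysadmin-test

# Print every user with enough packages who is also in the group.
matches=$(
  for user in "$dir"/*/; do
    count=$(grep -c . "$user/packages.txt")    # count non-empty lines
    if [ "$count" -ge "$min_packages" ] && grep -qxF "$group" "$user/groups.txt"; then
      basename "$user"
    fi
  done
)
echo "$matches"
rm -rf "$dir"
```

With the sample data only alice passes both filters. A real version would probably want a proper data store rather than loose text files, but flat files are enough to try out what kinds of questions are worth asking.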
To run it, you’ll need the python and python-twill packages (that’s what they’re called in Fedora; dunno about other distros). Here’s what I think most people reading this will want to do, ready for easy copy-pasting into a shell script or a terminal.
sudo yum install python python-twill
mkdir fasscraper
cd fasscraper
wget http://mchua.fedorapeople.org/FAS_scraper/FAS_scraper.py
python FAS_scraper.py
You’ll get a directory full of output that looks like this. They look reasonably pretty on account of they’re straight scrapes of the html pages. This is the sort of thing I pull up to show students when I speak about what a “FOSS portfolio” looks like, so I might just use it to quickly autogenerate “portfolio pages” for the folks I’m introducing them to on IRC. And yes, I realize some of these services are likely to have better APIs to interface with than “scrape the webpage, then consider parsing the html in later versions.” I do not know what they are or where they are.
I am, lines-of-code-per-unit-time-wise, one of the slowest programmers I know, because my docs-per-line-of-code ratio is ridiculous. It’s an old habit that comes from writing APIs usable by mechanical engineers. Fighting the temptation to rewrite. Will probably cave at some point and do a more proper version than the kludge that’s up now, but… if someone takes it off my hands before then, I’ll rest easier and not stay up all night fiddling with this.
In terms of moving this forward, what actually needs to happen is for this to be re-architected into a good general-purpose python library for getting data from FAS-authenticated services. Do things like “instead of manually defining the list of FAS usernames in the code, grab the list of usernames from the actual FAS system.”
Any takers? The first thing to do, methinks, is to get this baby under version control.
Monday, March 1st, 2010 | fedora, olin, teaching open source | 2 Comments »
From Luis Villa’s post on wiki usage in law school classes:
In my experience, wiki writing- whether the goal is inclusion in Wikipedia or not- really should be part of the law school curriculum. It is better than traditional papers for teaching basic research and scholarship, and if done well, can also teach collaboration, editing, and other writing skills. There is still a lot to learn about the ‘done well’ part, but I hope Prof. Goldman and others continue to experiment with it. They’re doing the right thing even if their students don’t realize it yet :)
Worth reading. Also worth reading for engineering students frustrated and befuddled by those annoying “contract” things: what writing a contract feels like, explained in coding metaphors (analogy actually from Alex Macgillivray, but following Luis’s blog is what led me there):
“To put it in computer terms, imagine the contract as a computer program. In each the object is to be able to interpret the words and have that interpretation drive a result. Now imagine that there is no compiler for your program and that you can’t run any tests. All debugging must be done only theoretically and in your head.”
Wednesday, February 24th, 2010 | teaching open source | No Comments »
Fedora design guru Mo Duffy gave a presentation on Fedora at RPI, her alma mater. Slides are available from the post, and as usual, they’re gorgeous. I’m reblogging here since Mo isn’t part of Planet TOS and I thought some professors might be interested in the slide deck.
Mo’s deck does a good job of addressing a problem I often have when speaking with a group of students (or for that matter, a group of people with a wide range of backgrounds in and exposure to FOSS). How do you describe such a huge and complex space (“The Fedora Project”) without either oversimplifying things or getting bogged down in details for hours? Her presentation sketches out the broad outlines of a space (“look at the wide range of things you can do in Fedora!”) and then shows examples of how to dive into details in a few of them (specific projects in specific teams).
Anyhow. There’s a Rensselaer Center for Open Source Software – this is a page I hadn’t seen beforehand, and which has some projects that seem to overlap with projects other schools in the TOS space are doing. For instance, OLPC Math (yes, RIT, I’m looking at you) or multi-touch (which is, if I recall correctly, how Seneca got started with Mozilla back in the day).
Synergy time? Luis Ibanez and Mukkai Krishnamoorthy are TOS-ers from RPI – looking at the list of TOS-ers also turns up Will Schroeder. RCOSS has a nice (and pretty darn up-to-date!) portal page, with projects – they have great talks going on, they have courses being taught… and I’d love to learn more about them, and what other schools can learn from what they’ve done. RPI folks, what’s happening in your neck of the woods these days?
Monday, February 22nd, 2010 | teaching open source | No Comments »
Spent the day at RIT (thanks to Karlie and Todd Robinson and their family for graciously hosting me!) working with Steve Jacobs and Remy DeCausemaker on POSSE (we’re on for June 14-18 with Chris Tyler coming down from Toronto to be one of our instructors) and RIT’s teaching-open-source-fu. As part of this, Remy and I spent the evening at Computer Science House (CSH) talking with students about how Getting An Open Source Job/Internship/Co-op Works. It was one of the best such sessions I have been to; Remy and I are both recent-enough grads to remember our student-brains and bridge a little bit of our current world (FOSS) back to our old one (school).
It was done completely off-the-cuff, but I now have a 10-slide deck sitting as pages in my notebook that I think will replicate the important parts. At some point I should transcribe and post them. Nagging me may help accelerate this process. ;-) This, however, was the most important section (in my mind). Here’s what we’re taught as kids. Or at least what I was taught as a kid.
In My Parents’ World – How to Do Stuff (answer: “Get A Job according to these Complex Procedures”)
- Be interested in something
- Study (very, very, very) hard
- Get (very, very, very) good grades
- Make a resume
- Buy suit
- Apply to jobs
- Get introduced to recruiter
- Get interviewed
- Get hired
- Do Stuff
Except that the “Do Stuff” might be… well, entry-level job, filing stuff, doing thankless gruntwork waiting ’till you could move up the ladder and really Do Stuff, you know…
In The FOSS World – How to Do Stuff (answer: “Do Stuff”)
- Be interested in something
- Do Stuff
- Get hired
So here’s the thing: to your employer (according to my parents’ world), steps 2-8 in the first list are filters for the “getting hired” step. Hypothetically, your future boss can use things like grades, cover letters, etc. to figure out who’s going to add value to the project. And… all right, maybe they kinda work. And that’s about as good as you can get when Getting Hired is a prerequisite to Doing Stuff – they have to hire you before they give you access to the lab, the mailing list, the hardware, the code, and so forth.
But in FOSS-land, Getting Hired is not a prerequisite to Doing Stuff. In fact, it’s the other way around. Doing Stuff is the filter. Your future boss is waiting for you to give him/her an excuse to hire you. Not only hire you, but hire you to do something you’re already so interested in that you’re doing it for fun in your free time, and do it better. And this is not restricted to open source companies – banks need sysadmins, hospitals need programmers, robotics labs need UI designers… if you ask the question “could open source benefit this place?” and the answer is yes (and it should basically always be yes), then seriously consider offering to be the one to Bring It There.
Now, folks who’ve been around open source for a while will think this is obvious. But to me a couple years ago, coming squarely from the mindset I was taught as a kid, this blew my mind. In FOSS-land, we toss around “oh, just Do Something” as if it were the most obvious thing in the world – but it isn’t. See, steps 2-8 in the first list serve as a filter for potential employers, but they also serve as scaffolding to first-time jobhunters. You’ve never been through this process before, where do you start, how do you figure out what people value? Well, you make a resume according to these recommendations…
So if we’re replacing and reordering a whole chunk o’ steps with an extremely nonspecific instruction to “Do Something,” we’d better elaborate a little more on what that is and how you start going about it.
That’s what the rest of the evening was about.
Tuesday, February 16th, 2010 | fedora, teaching open source | 4 Comments »
Dear students involved in Fedora, Sugar Labs, OLPC, and/or any other open source project:
Please take a minute to explain to people with fancy titles why you are awesome.
That is all. I hope to see some of you in Boston in April spreadin’ the good news.
Thursday, February 11th, 2010 | fedora, olin, olpc, sugar, teaching open source | No Comments »
For those in academia: I’m wondering whether this might be the kind of mini research problem (or even non-mini research problem) that university students might be excited about picking up.
From Matt Domsch’s post to the Fedora advisory board list:
I’ve spent quite a bit of time over the last week fixing up the scripts that generate Fedora’s worldwide user maps including the worldwide map for all Fedora versions currently in use as determined by yum requests for mirrorlists… we currently have no way to know, within even a 2-4x margin of error, how many current installs of Fedora there are. But this number, and it’s growth (positive, or negative), would be interesting to know, if only it were more accurate.
Now… it seems to me that this is an interesting technology-related problem requiring technical fluency, development of keen analysis skills, and with immediate potential real-world (i.e. would-actually-affect-the-work-of-thousands-of-people) implications. And it also seems to me like it would fit nicely into the format of an independent study or a small thesis – actually, my first reaction (before starting to think about potential solutions) was “huh, this is something I could work on, write a paper about, and turn in for school, if I were still in school.”
Thoughts? Would these sorts of “huh, we wonder how we’re doing on this front?” introspective semi-research-ish questions coming from open source communities make good material for student projects (independent research, class projects, whatever format it would fit in)? If so, how could we help students get started on them? If not, why not – and what sorts of problems would be better?
Thursday, February 4th, 2010 | fedora, teaching open source | No Comments »
We just finished a big last-minute sprint in the #teachingopensource IRC channel to push our OSCON track proposal through.
Here are the five I’m listed for:
- 5 FOSS in Edu projects that changed the world (Karsten Wade, Mel Chua)
- Junior Jobs and Bite-sized bugs: Entry points for new contributors to open source (Mel Chua, Asheesh Laroia)
- Teaching FOSS to Undergraduates (Ralph Morelli, Luis Ibanez, Carlos Jensen, Mel Chua, Timothy Budd, Leslie Hawthorn)
- Teaching Open Source Has A POSSE (Professors Open Source Summer Experience) (Mel Chua)
- The Open Source Way: Leveraging Communities (Sebastian Dziallas, Mel Chua)
As a first-time proposer, I was amazed to see how quickly things came together, snowballing right up until the very end. A few minutes before the deadline, proposals were still being written – for the one I was in, Asheesh was filling in the outline Karsten had set down while Jeff caught typos and I copy-pasted the finished bits into the OSCON proposal submission page. We got it in right under the deadline. (In other news, etherpad kicks some serious butt.)
I’m hopeful that our track (and at least some of our talks) will be accepted – I definitely want to see… oh, pretty much all the talks listed. July should be a fun month for travel this summer.
In other news, the road trip back to Boston (from Raleigh) continues. I’ve made it to New York, staying with Sumana and Leonard. Many thanks to Paul and his family for putting me up last night (complete with affectionate dog resulting in my pants still having dog hair on them – but I’m happy about this, since my opportunities to get covered in dog hair are far less frequent than I would like them to be).
It is nearly 4am, which means unconsciousness shall claim me shortly. Farewell to the waking world for a few hours.
Tuesday, February 2nd, 2010 | fedora, teaching open source | No Comments »
Taught this system in about 15 minutes to my cousin Melanie (high school freshman working on her first Big Paper for history; it’s something about People with Swords in Ancient China, so I’m totally reading it when she’s done) this afternoon and realized I’d never written it down, so here goes: this is how I’ve taken reading notes and written papers since I was in high school. I’m also writing this in part to prove that the terminal is useful for things other than writing code, because I did not know how to code when I started doing this.
My system is largely predicated on the assumption that I am a Lazy Bum, and basically involves 4 tools: cat, grep, | (pipes), and flat text files. These are standard Unix tools, and I’ve never seen a Linux distro without them; Melanie and I already run Fedora, so we were all set.
I grab the text of books when possible (mm, Project Gutenberg) and take advantage of the fact that my computer can read faster than I can. For instance, for history my Junior year of high school, I had to write some paper about the Judeo-Christian belief system. I forget the exact topic now, but let’s imagine I wanted to grab out some nice quotes about the symbolic use of… say, swords. I like swords. So I download the King James Bible as king-james-bible.txt, and…
cat king-james-bible.txt | grep -C 1 sword | less
In English, this means “send (concatenate) the text of the bible through a filter (global regular expression print) that looks for the word ‘sword’ and shows the -Context of 1 line before and after it, then let me scroll through the results (less).” The results look something like this.
01:003:024 So he drove out the man; and he placed at the east of the
garden of Eden Cherubims, and a flaming sword which turned
every way, to keep the way of the tree of life.
--
01:027:040 And by thy sword shalt thou live, and shalt serve thy brother;
and it shall come to pass when thou shalt have the dominion,
--
stolen away unawares to me, and carried away my daughters, as
captives taken with the sword?
--
that two of the sons of Jacob, Simeon and Levi, Dinah’s
brethren, took each man his sword, and came upon the city
boldly, and slew all the males.
And so on. Instant sworditude, much faster than actually reading the whole darn book (or Book, in this case).
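A few variations on that search, as a sketch rather than gospel: grep can read the file by itself (so the cat is optional), -i ignores case, and -w matches whole words only. The sample file below is a tiny stand-in I made up so the commands run anywhere; -w is not in every grep ever written, but GNU grep (what Fedora ships) has it.

```shell
# Tiny stand-in for king-james-bible.txt so this runs without a download.
sample=$(mktemp)
cat > "$sample" <<'EOF'
line one
and a flaming sword which turned
every way
The Sword of the Lord
beat their swords into plowshares
EOF

# -c prints the number of matching lines instead of the lines themselves.
plain=$(grep -c sword "$sample")      # "sword" anywhere, case-sensitive
nocase=$(grep -ci sword "$sample")    # -i: also matches "Sword"
whole=$(grep -ciw sword "$sample")    # -w: whole word only, so "swords" is skipped
echo "$plain $nocase $whole"
rm -f "$sample"
```

Drop the -c and pipe into less again once the pattern finds what you want.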
For those looking for a more powerful alternative to grep, try ack. (The website URL is pretty accurate.) I was introduced to ack at TOPP and have never looked back; the main advantage is how easy it is to deploy ack on huge trees of folders swarming with text (or code) files, meaning that you could, instead of just looking in the King James Bible, deploy the above search for swords in every note you’ve ever taken on every book you’ve ever read. Assuming those notes are textfiles dumped somewhere underneath the folder you’re searching in, I mean. It’s made fascinating connections between long-ago reads I never would have thought of on my own, and my papers in college were much improved by it.
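If ack isn't installed, plain grep can do a rougher version of the same "search every note I've ever taken" trick with -r. This is a sketch using grep only (I'm not showing ack's own options here), and the folder names and note contents are invented for the demo.

```shell
# Made-up notes tree: a folder per subject, a text file per book.
notes=$(mktemp -d)
mkdir -p "$notes/history" "$notes/fiction"
echo 'Q: (12) He drew his sword.' > "$notes/history/ancient-china.txt"
echo 'N: Nothing about blades here.' > "$notes/history/rome.txt"
echo 'P: (3) The sword is reforged.' > "$notes/fiction/epic.txt"

# -r walks every file under the folder; -l lists only the names of files that match.
found=$(grep -rl sword "$notes" | sort)
echo "$found"
rm -rf "$notes"
```

Swap -l for -C 1 to see the matching lines with context, file by file.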
I also take my reading notes in flat text files as I go through books. Those textfiles look something like this:
Bennett, Arnold. How to Live on 24 Hours a Day. New York: Shambling Gate, 2000. Print.
P: (5) Lay out things for tea at night so you can make tea in the morning as a nice wake-up call.
Q: (5) [breakfast] The proper, wise balancing of one’s whole life may depend upon the feasibility of a cup of tea at an unusual hour.
N: Hilarious writing style. Read this book whenever the need for British wit strikes.
?: (7) Was this before or after Taylorism?
N: (7-8) This program would only work in a highly literate population. Which I suppose the reader belongs to, as they’re reading the book. But still.
Note a couple things.
- Full bibliography at the top so I never have to figure out the formatting for it again.
- Each note gets a new line, and begins with a letter code for the type of note it is: P for paraphrasing (summary), Q for quote, ? for a question I have, N for a note (my own thoughts), and some not shown here, like R for “reference to some other material I should look up later” (such as when one book cites another that I figure I should read).
- Optionally, page numbers appear in (parentheses) immediately after the note type.
- Super-optionally, tags appear in [brackets] after the page numbers, mostly when I want to be able to associate a quote with a word that’s not in the quote, for ease of searching later.
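Since the whole system leans on every note starting with a letter code, a quick check for lines that break the convention can save some head-scratching later. A minimal sketch, assuming the bibliography is always line 1 and that P, Q, N, R, and ? are the only codes in use; the last sample line is a deliberately broken note I added.

```shell
notes=$(mktemp)
cat > "$notes" <<'EOF'
(full bibliography line goes here)
P: (5) Lay out things for tea at night so you can make tea in the morning as a nice wake-up call.
?: (7) Was this before or after Taylorism?
forgot the letter code on this one
EOF

# Skip the bibliography on line 1, then print every line that does NOT start
# with a known letter code followed by a colon and a space.
bad=$(tail -n +2 "$notes" | grep -vE '^[PQNR?]: ')
echo "$bad"
rm -f "$notes"
```

An empty result means the file follows the format.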
Then I can make queries like “what were all the questions I had about this book?”
cat how-to-live-on-24-hours-a-day.txt | grep '?:'
Or “what interesting stuff was on page 7?”
cat how-to-live-on-24-hours-a-day.txt | grep '(7'
And so forth.
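Two more queries in the same spirit, sketched against a copy of the sample notes from above (with a placeholder for the bibliography line). The quoting matters: brackets and parentheses mean something to the shell and to grep, so the patterns go in single quotes, and -F tells grep to treat the tag literally.

```shell
notes=$(mktemp)
cat > "$notes" <<'EOF'
(full bibliography line goes here)
P: (5) Lay out things for tea at night so you can make tea in the morning as a nice wake-up call.
Q: (5) [breakfast] The proper, wise balancing of one's whole life may depend upon the feasibility of a cup of tea at an unusual hour.
N: Hilarious writing style. Read this book whenever the need for British wit strikes.
?: (7) Was this before or after Taylorism?
N: (7-8) This program would only work in a highly literate population.
EOF

# "Which quotes did I tag with breakfast?"
tagged=$(grep '^Q:' "$notes" | grep -F '[breakfast]')

# "Which of my own notes start on page 7?" Requiring ")" or "-" right after
# the 7 keeps pages like 70 or 71 out of the results.
page7=$(grep -E '^N: \(7[-)]' "$notes")

echo "$tagged"
echo "$page7"
rm -f "$notes"
```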
Confession: I’ve fallen off the wagon and haven’t taken notes like this since I left school. I’m trying to climb back on it again, as this sort of database is gloriously helpful to build. Particularly if one plans on doing lots of reading and writing of papers. Like, say, if one were to consider grad school.
I’m sure this system could be improved; I once had dreams of writing a GUI for it, but found this worksforme enough that I just never made one. There are probably better tools out there for it, there’s probably a lot of regexp-fu I could pick up to do more powerful queries (in fact, this is one of the reasons why I know regular expressions at all), there’s… well, you know what I’m about to say.
Patches welcome!
Sunday, December 20th, 2009 | fedora, teaching open source | 10 Comments »
While reading Matt Jadud’s blog via Planet TOS, I came across the blog of Cory. Cory is one of the students working with Matt on Operation: Stick Figure Army. His recent post “how to cope with the design phase?” was about how he (as a blind hacker) goes about a process that most engineers rely on highly visual tools for (UML, sketching on whiteboards, etc). I can’t comment on his blog without a login I’m not sure of how to get, so I’m writing a longer blog entry as a response instead. (For those who don’t already know, I’m a deaf* hacker.)
First of all, I like the way Cory and his professor handled the question on how to assess his understanding of UML diagrams (a visual convention for describing program structure and a required topic in a class Cory is taking). He has to demonstrate understanding of the concepts; it’s just that the input and output methods for that understanding are different.
…even though I may not be drawing diagrams, that doesn’t mean that I’m not responsible for knowing how each diagram is used and how to describe one.
Reading Cory’s description of how he describes UML diagrams in text reminded me of the time in elementary school when my music class was going through the instruments of the orchestra; we were listening to sound clips from different instruments and had to write about each one. Since I can’t hear high frequencies, my reports went something like this: “The tuba sounds like this, the bassoon sounds like that, the piccolo has a fascinating history and an intricate key mechanism that I will now diagram…”
I don’t believe that a fundamental property of the software development cycle is that it is visual. I think we make it that way because most people think it is more convenient…
I agree. And I don’t believe that a fundamental property of high-bandwidth conversation is that it’s auditory, either. I know many people who, at the present moment, find phone conversations to be the easiest way for them to communicate with others long-distance. But that’s different from saying phone conversations are the most effective way of doing so, depending on your goals (for instance, phone conversations currently – usually – don’t get logged for posterity, let alone logged in a way that can be automatically translated). Similarly, there are undoubtedly highly effective non-visual ways of doing design. As someone who’s highly visual myself, I don’t know what they are, but I would love to learn. (One of the reasons I enjoyed reading Cory’s posts is that my hearing forces me to rely so heavily on visual input that I often forget to run thought experiments suspending the assumption that I can.)
I’d actually like to learn more about the design practice of looking at edge-case users (not sure if there’s a better term for this). Maybe posts like Cory’s can shed some insight into the advantages of non-visual design systems, or the disadvantages of visual design systems, in a way that makes both of them better for everyone (not just the visually impaired). I look a lot at the benefits of alternatives to auditory-by-default systems because I have to, and sometimes the adjustments I make end up being useful to other people.
Matt pointed out in his response to Cory’s post that the majority of the software development world does not use visual input either.
…the overwhelming majority of our communication and collaboration regarding software developing is written/verbal, not visual. That is, we’re not shipping pictures back-and-forth 24/7—we’re chatting on mailing lists, IRC, and blogs to get things done.
However, I do wonder whether the dominance of text-based communications in software development will continue as tools like inkboard (collaborative Inkscape) continue to be developed, or if an (initially secondary – possibly only within a subculture at first) alternative, more graphical/auditory discourse will start happening. The parallel for me is podcasting and vlogging. They haven’t replaced forums and mailing lists in general online discussions yet, but they are definitely a presence that I grow increasingly disadvantaged for having to ignore.
Well, mostly ignore. Strictly speaking, I do have the advantage that I can catch some audio, and that I have friends who’ll sometimes take the time to write a video summary for me, or sit next to me while a podcast is running and re-mouth the words so I can lipread them, but for most practical purposes, that’s like saying that publishing documentation in Tamil should be entirely sufficient for English speakers because of the presence of Google Translate. It takes a lot of extra conscious effort, the availability of specific tools and helpers, a lot of extra time, and much is still lost in translation, so it’s usually not worth the investment to even try.
I find it fascinating to see how other people adjust and hack inclusion into a world that often doesn’t assume them in its default case. At least with open source I get to hack on things – and with things – that give me the freedom to shape them into what I need (yay visual system beeps!) but the burden’s still on me to do the shaping and the constant reminding of others that I need accessibility to the things they’d like me to contribute to (for instance, project meetings by phone virtually guarantee my silence). At least the burden here comes with the tools I need in order to assume it. (Mostly. We could do better, but that’s a longer post.) And I’m glad projects like Sugar try to make themselves more-hackable-by-default.
*re: “deaf” – I’m trying to get used to being able to use this word as well, though I can hear some sounds (my hearing loss is classified as “severe”) and grew up mainstreamed in the hearing world (with lots of hacks). It’s a cultural adjustment that I’m consciously learning (with tremendous latency and deep discomfort) to make.
Thursday, November 19th, 2009 | fedora, olin, sugar, teaching open source | 1 Comment »