2x21: A Roomba for Companionship

Stuart Langridge, Jono Bacon, and special guest presenter Alan Pope from the Ubuntu podcast and Canonical and Ubuntu (standing in for Jeremy while he tours Europe) present Bad Voltage, in which popey has a hundred job titles, we race to put out the show before the Ubuntu Podcast people do, and:

  • [00:02:20] In the news: Sony Japan announces a next generation "Aibo" personal canine robot, causing extreme Jono mockery for buying an early protoaibo years ago... everyone hates Perl, Delphi, and VBA, in more shocking reveals from Stack Overflow... Google Drive is locking some docs from sharing because they are "violations of the terms of service"... GitHub publish State of the Octoverse 2017, where there are 25 million active repositories, all of which are new JavaScript web frameworks... bloke says that nobody uses libraries and librarians are "sad people who can't get proper work", gets 110,000 replies vehemently disagreeing: watch out for librarians, they're tough... Man takes daughter to his work at Apple, lets daughter film new unreleased iPhone X and publish the video, gets fired, not all that surprisingly
  • [00:42:35] Ubuntu release the new 17.10 release, and we talk to Alan about it. This leads into discussions of being data-driven when planning, what the deal with snaps is, and upstream relationships
  • And news on recent conferences: Jono at Open Source Summit Europe, Alan at Freenode #live, and Stuart at Hackference

Download from http://www.badvoltage.org/2x21/ now!

1 Like

That was fun, thanks @sil & @jonobacon for asking me to be on, and thanks to @jeremy for not being there :wink:

5 Likes

I enjoyed @popey on the show. It would be very nice to have him on more, or maybe even the fourth wheel on the BV vehicle? Just a thought! :slight_smile:

As to libraries, I am not able to buy books at will. So, I find the library useful to see if I want to make that purchase. It has saved me much over the years.

3 Likes

Regarding the “Google (accidentally) blocks documents” story:

I take issue with @sil's description of this as “probably a bug”.
I’m not saying it’s not a bug, it might be.
But what’s more likely is that it’s (from the perspective of the google engineer in charge of this) a “misclassification”.
The immediate difference for the person on the other end of that piece of code might seem small, but conceptually it’s different.

When you find a bug: you fix it, and the problem goes away.

When you have misclassifications: you can

  • get more data (e.g. this new and interesting data point we just got from an angry journalist who had her stuff blocked)
  • tweak your hyper-parameters
  • switch your learning algorithm.

None of which fundamentally solves the problem that a machine will make mistakes at the highly contextual task of deciding what is or isn’t fit for public consumption in terms of opinions expressed, visual content, fair use, … The questions that program should magically solve have earned lawyers and judges their salaries for years and years.

And the problem doesn’t stop there either. As you guys mentioned about youtube demonetization: these little “nudges” don’t necessarily prevent this work (video in the case of youtube) from being published. They really take away from the next work of the creator.

These learning systems will always fail most easily and most visibly in the fringes of what they see. It is not often that a google-docs content-bot gets to see a well-researched investigative piece on the dark web, so data in that part of its input space is relatively sparse and its estimations will therefore have more variance.
The little nudges will (nearly) always drive people more towards (what the computer perceives to be) The Norm ™.
We already have spheres where we can see what this leads to: airports. Any kind of “unnormal” behaviour in an airport can easily get you into trouble (more easily depending on your looks…) and nowhere do people look as normal as there.
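To put a rough number on the "sparse data means more variance" point, here is a minimal Python sketch. It assumes a deliberately simplified picture, estimating a single proportion from independent samples, and says nothing about how any real content classifier works. The rate and sample counts are made up for illustration.

```python
import math


def standard_error(rate, n_samples):
    """Standard error of a proportion estimated from n independent samples."""
    return math.sqrt(rate * (1.0 - rate) / n_samples)


# Hypothetical numbers: the same underlying rate in both regions of the
# input space, but very different amounts of data seen by the system.
true_rate = 0.05
common_region = standard_error(true_rate, n_samples=100_000)
fringe_region = standard_error(true_rate, n_samples=20)

# How much noisier the estimate is in the sparsely covered region.
ratio = fringe_region / common_region
```

With these assumed numbers the estimate in the fringe region is roughly seventy times noisier than in the well-covered one, which is the sense in which the fringes are where mistakes show up first.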

I’m curious, @jonobacon: why is iPhone 10 weird?

iPhone I, iPhone II, iPhone III, iPhone iV, iPhone V, iPhone VI, iPhone VII, IPhone VIII and iPhone iX are obviously well known products in the series. Wait! I might be wrong here.

Nobody uses Libraries: I disagree strongly here; I still use libraries a lot. I don’t borrow books any more, but I use them for research and as a workspace.

If I am going to use a book a lot I’ll buy my own copy, either Kindle download or real book. But the library is a good place to try a book and only buy it if it is actually going to be good for me, whether as a suitable reference, text book or piece of entertainment. This requires a good library, and due to budget constraints lots are underfunded. My local libraries in Huntingdon and Southend-on-Sea are, but Cambridge is a good library, and last time I was in Birmingham that was a good library too. @sil, please correct me if that has changed since I left the Midlands.

Mr Alan Pope (aka @popey), great to hear from you. I love the Ubuntu UK podcast, and I would like to see you back here on BV again.

4 Likes

Precisely my point. :slight_smile:

Let me clarify here, and see if you agree. Allow me to specify a small example, which may help.

Imagine that Google have a classifier where they scan documents for “bad words”, and if there are more than 10 “bad words” in a document, that document is classified as a “bad document” and can’t be shared. (Leave aside whether this whole concept would be a bad idea to exist at all; assume for the moment that this is a legitimate thing to do and we all agree with it.) So the code to do this would look something like this:

bad_words = ["terrorism", "nazi", "explosion", "white power", "child porn"]
count_of_bad_words = document.occurrences(bad_words)
if count_of_bad_words > 10 then document.is_bad_document = true
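For anyone who wants to poke at this, a runnable version of that pseudocode might look like the following Python sketch. The function names and the case-insensitive substring counting are assumptions made for illustration; nothing here reflects how Google actually does it.

```python
# Word list and threshold taken from the pseudocode above.
BAD_WORDS = ["terrorism", "nazi", "explosion", "white power", "child porn"]
THRESHOLD = 10  # more than this many hits marks the document as bad


def count_bad_words(text, bad_words=BAD_WORDS):
    """Count case-insensitive occurrences of each bad word or phrase."""
    lowered = text.lower()
    return sum(lowered.count(word) for word in bad_words)


def is_bad_document(text, bad_words=BAD_WORDS, threshold=THRESHOLD):
    """Return True when the document has more hits than the threshold."""
    return count_bad_words(text, bad_words) > threshold
```

Note that plain substring counting is itself a crude choice: it would count a word embedded inside a longer, unrelated word, which is one more way a rule like this can catch innocent documents.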

I think that a misclassification would be adding something unjustified to the “bad words” list: for example, someone decides to suppress documents with the word “homosexual” in them, out of some misguided attempt to “protect” children.

I think a bug would be accidentally typoing the if line and writing 1 where they meant to write 10, so the code looks like this:

if count_of_bad_words > 1 then document.is_bad_document = true

The distinction, in my mind, is that a misclassification means that the suppressed documents are being correctly suppressed according to Google’s current rules – they want them suppressed – but their rules are too general or too inaccurate, and the way to fix this is to say “ah, we thought all documents that matched this ruleset were bad documents, but it turns out we were wrong about that, and so we are revising our ruleset”.
A bug, on the other hand, means that Google do not want the current documents suppressed, and the issue is not with the set of rules, but with the code that’s implementing them; the fix is to update the code so it does what was intended in the first place.

If I have your characterisation correct, then this means that you think that Google drew up their rules for suppressing documents incorrectly, and they need to change their ruleset because it has unexpected consequences. Whereas I think that Google’s ruleset wouldn’t have suppressed these documents in the first place and the issue is that the ruleset isn’t being correctly followed. My reasons for believing this are, basically, that I’m giving them the benefit of the doubt. This may not be justified, and if it turns out that you’re right and they drew up a ruleset which suppressed innocent documents because they didn’t think it through enough, then I would not be wholly surprised.

I think we’re both in agreement that these documents are not being deliberately and explicitly suppressed because that’s exactly what Google want, right? It is some measure of mistake, although whether a tactical or a strategic mistake is still undetermined.

1 Like

Yeah, you got what I meant. For me, an easy distinction between the two is “is it a source code update, or do I just push out a new .dat file?” This only works if bad_words = np.fromfile('bad_words.dat', sep=' ') :wink:
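The "new .dat file versus source code update" distinction can be sketched in Python like this. It is only an illustration: the file name, helper names, and word lists are hypothetical, and since np.fromfile is geared towards numeric data the sketch reads the word list as plain text instead.

```python
import tempfile
from pathlib import Path


def load_bad_words(path):
    """Read one word or phrase per line, skipping blank lines."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    return [line.strip().lower() for line in lines if line.strip()]


def is_bad_document(text, bad_words, threshold=10):
    """Same rule as the pseudocode earlier in the thread."""
    lowered = text.lower()
    return sum(lowered.count(word) for word in bad_words) > threshold


# A temporary file stands in for bad_words.dat.
with tempfile.TemporaryDirectory() as tmp:
    data_file = Path(tmp) / "bad_words.dat"
    data_file.write_text("explosion\nwhite power\n", encoding="utf-8")
    words_v1 = load_bad_words(data_file)

    # Addressing a misclassification: push a new data file, code untouched.
    data_file.write_text("white power\n", encoding="utf-8")
    words_v2 = load_bad_words(data_file)

doc = "explosion " * 11  # e.g. an article about fireworks safety
blocked_before = is_bad_document(doc, words_v1)
blocked_after = is_bad_document(doc, words_v2)
```

The code of is_bad_document is identical in both runs; only the data changed. A bug fix, by contrast, would be an edit to the function itself, such as correcting the threshold.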

So yes, we agree this is very likely unintentional. That is, google wouldn’t want those documents to be blocked, but given their current model and the data their system has seen, that’s the best estimate they can come up with, so that’s what they’re gonna go with. But since it’s not a bug, it won’t be solved with a patch.

1 Like

side note: I just earned the “first emoji” badge, and I have to say, these overly smiley… smileys make me feel like I’m 15 or so. Whatever happened to the good old non-overly expressive :)

2 Likes

I think we should have a vote on keeping @popey as a show regular, and I don’t think he should have a say in it :wink:

5 Likes
2 Likes

“Nobody uses Libraries” What balderdash! What bunk! What poppycock! What [insert some other English’ism for b.s. here]! I too disagree strongly.

The library here on the Island just expanded (~x2) and at the Grand Opening more than 1000 people showed up. Given that the full time population barely breaks 4000 and the beer was NOT free (though the hotdogs were) I think that represents a pretty stunning refutation of the idea that nobody uses libraries. Frankly, we’re darn lucky the tourists don’t wander in (much) during the summer season since we could hardly cope.

3 Likes
  1. iPhone
  2. iPhone 3G
  3. iPhone 3GS
  4. iPhone 4
  5. iPhone 4 CDMA
  6. iPhone 4S
  7. iPhone 5
  8. iPhone 5C
  9. iPhone 5S
  10. iPhone 6 / 6 Plus
  11. iPhone 6S / 6S Plus
  12. iPhone SE
  13. iPhone 7 / 7 Plus
  14. iPhone 8 / 8 Plus
  15. iPhone X

Shouldn’t the iPhone SE be #11 in the list?

Our local library is all the things mentioned in the show and more: eg. a notice board (for some subset of local events, groups).
It offers a different selection of books on a topic than you would find on Amazon’s first results page too, same as any other bookstore might. It is more environmentally friendly than buying new books if they’re read-once-leave-on-shelf-forever; Amazon and other used-copy options are not always economical compared to new (a realistic choice depends on the cost difference and on delivery cost and time, frequently from abroad or via a slow shipper).

PS Am going to ORG Con in London today/tomorrow, anyone else? That’s open rights group.

1 Like

@sil 2:46mins reminds me of noel’s services (he has a DVD-pack on amazon for fiddy dollars), although I may be stretching the concept somewhat; but watch the dvd pack if that’s your business.
Where’s the skivala link ? by the way, I am sure this has been mentioned before, years ago.

@popey 52mins just do a poll on reddit & ycombinator about 17.10, like you did before, is my suggestion. You could raise the topic on Late Night Linux (maybe cool) , although I imagine they shall query , mostly, the format of the poll, which is not the nature of the query (you could just do it in HTML5),

@oldgeek as far as LBRY’s they are certainly not for chromebooks - as my 16 GB hard-drive quickly became full. but I read the reddit once a week.

@breezer no idea what org-con is as I have never seen them, but why not talk (private_message) to @Sarah_Scarlett as she uses an interesting subreddit for that kinda stuff. hope you get your coffee, pal.

Also, https://imgur.com/t/library_porn/d8NyL

1 Like

Fortunately, mine has a micro sd slot.

1 Like

I’ve never been to that subreddit, it’s super interesting though so thanks for the link! :slight_smile: I think you may have me confused with someone else, and I’m well aware there is someone else online using my name. I’ll reserve comment but I’ll just say that I have a GNU logo tattoo, so if there’s no GNU please rest assured it’s not me.

Also super interesting and cool are libraries! However, I do not use them because I’m not very mindful and my books usually end up looking like this after getting crammed in my backpack when I’ve once again almost missed my subway stop, etc. etc.

A while ago I mentioned @sil may want to pm me regarding getting free amazon gift cards. I only requested pm out of respect because there’s a referral residual schedule he could have taken advantage of but it was no secret or anything otherwise. I use swagbucks (just the videos with a couple old android devices and an ancient iphone, I’d stay away from the offers etc. as they look pretty scammy) and that keeps me in amazon gift cards to buy books, etc. :slight_smile:

1 Like

Why?