Apr
23
Posted on 23-04-2008
Filed Under (Blogging, Microsoft, Social Media) by italovignoli on 23-04-2008

Under Italian law I am a freelance journalist, as I am a member of the Italian association and have renewed my membership for 2008.
For a small percentage of the community - in the area of marketing communications - I am a blogger, without ambitions but with a small group of followers.
For Microsoft I am neither. Or, worse still, I am a blogger only when I write something that they do not like.
Otherwise, I am a “business”, as you can see from the badge I got today at the MIX event in Milan, which allowed me neither to attend Steve Ballmer’s press conference nor to join the cocktail lunch that followed.
microsoft-badge.jpg

I was with a group of friends - journalists and bloggers - with whom I shared some Web 2.0 laughs… I suspect that Microsoft should reconsider its concept of “conversation”; otherwise its monologues will, in the end, turn into a soliloquy.

Mar
03
Posted on 03-03-2008
Filed Under (Interoperability, Open Document Format) by italovignoli on 03-03-2008

I’ve tried to understand what happened in Geneva during the Ballot Resolution Meeting, a very important step in the Fast Track process for the standardization of the Office Open XML document format.

I’ve gone through dozens of posts with the clear feeling that the reports were biased, in one direction or the other. Unfortunately, the voting process doesn’t help understanding at all: the effect of an abstention, for instance, can differ from the norm, since it may itself express a vote.

In the end, I’ve found only two posts worth reading, as they try to be balanced in their opinion, although they were written by people who oppose OOXML: Tim Bray (Canada) and Yoon Kit (Malaysia). They try to give a feeling for the work done by delegates during the BRM, and for the short time available for the huge task of going through over 1,000 comments.

Both stress their negative opinion of applying the Fast Track process to a document format whose specification is as large as OOXML’s, a whopping 6,000 pages (you can even find pictures of the printout).

I’ve decided not to link to the biased posts, which can easily be found by Googling “BRM Geneva” or “BRM OOXML”. You can find the entire spectrum of marketing hype, from “it was an unbelievable success” to “it was a complete disaster”, and you can get a sense of the commercial interests behind document standards.

Feb
28
Posted on 28-02-2008
Filed Under (Open Document Format) by italovignoli on 28-02-2008

Lorem ipsum dolor sit amet, consectetuer adipiscing elit.

Fusce iaculis libero sit amet lacus. Sed sed neque. Duis aliquet. Aliquam quis est. Aenean pulvinar, nibh sed dapibus congue, arcu enim eleifend tellus, in blandit augue magna eu tortor. Curabitur auctor eleifend orci.

Cras justo. Vivamus pulvinar convallis nisl. Cras sagittis orci sed mi. In hac habitasse platea dictumst. Mauris nec massa. Integer metus. Nunc euismod pede vel lectus. Ut eget urna congue arcu vehicula placerat. Cras pulvinar. Quisque ut augue. Duis ullamcorper sollicitudin arcu.

Donec tincidunt semper ligula. Mauris ac pede vel neque vulputate euismod. Mauris scelerisque ipsum a massa. Morbi mollis, nulla non aliquet consectetuer, tellus nunc facilisis nulla, non faucibus metus erat vel magna. Phasellus pellentesque eros sit amet erat.

Phasellus aliquam orci. Duis varius neque condimentum sem. Quisque condimentum nibh vitae tellus. Proin ac enim eu urna ullamcorper suscipit. Curabitur gravida sem non magna. Morbi dignissim scelerisque enim. Nulla porta nibh ut felis. Mauris commodo lorem vel elit.

Feb
26
Posted on 26-02-2008
Filed Under (News, OpenOffice.org) by italovignoli on 26-02-2008

Orvieto has submitted a proposal to organize the OpenOffice.org Conference in 2008.

I quote John McCreesh:

I believe one European bid this year stands above the others, which is the bid from Italy. I believe the combination of an experienced organising team, a delightful warm location, and a thriving local community would be hard to beat. I would urge anyone wanting an OOoCon in Europe this year to unite behind Orvieto.

Voting is open to all individuals who were registered as members of the Community on January 1st 2008. If you happen to be one of them, please go to this page and do your duty.

Feb
19
Posted on 19-02-2008
Filed Under (Readings) by admin on 19-02-2008

Mike Stonebraker has now responded to the second post in my five-part database diversity series. Takeaways and rejoinders include:

I obviously wasn’t clear when I talked about two major competitive relational challenges to Oracle, et al. I simply was referring to

  1. Mid-range relational DBMS and
  2. High-end analytic DBMS

Earlier I thought Mike was forgetting about the distinction between high-end and mid-range RDBMS. Naturally, that didn’t last long. He’s actually calling the mid-range systems “open source”, but that’s a decent first approximation to a hard-to-define category.

My real reservations about Mike’s post lie in the area of analytic DBMS. Mike points out that there are two kinds — row-based (which he thinks are destined to be obsoleted) and column-based (which he thinks are destined over time to run “the vast majority of analytic workloads”). Now, his predictions may eventually come true. But row stores dominate the specialty data warehouse DBMS market today.

What’s more, some major use cases such as data mining or on-the-fly scoring look inherently row-centric to me. Also, consider website personalization. It calls for pinpoint data lookup, integrated with analytics. Will that eventually be done by beefed-up OLTP systems? Stream processors? Column stores? Analytic row stores? None of the possibilities can yet be ruled out. Indeed, I’m not sure we can even make a good start on predicting the ultimate answer unless we first figure out what will be done in RAM, and what will continue to be driven from disk.

Speaking of assumptions, there’s a major sub-text coloring all these discussions. Stonebraker is on record claiming that a vast majority of data warehouses (he uses figures up to 99%) have or should have single-fact-table schemas. Indeed, Mike’s columnar product Vertica hasn’t yet been enhanced to handle anything but the single fact table scenario. While that certainly fits a lot of applications, it also leaves a lot out. Profitability calculations like those Kalido specializes in will have one fact table for revenue, but others for costs or margin deductions. Marketing warehouses might have one fact table each for fundamentally different kinds of customer contact (web, phone, etc.), plus one for actual transactions, plus one for external data.

This may ultimately be a distinction without a difference, in that a system well designed for 1 fact table will also do a good job on N fact tables, as long as the N tables have a shared key (e.g., customer ID) that can be used to simultaneously partition them. But it illustrates that columnar systems haven’t proved their eventual dominance quite yet. And if we’re looking at current and near-future use, row-based specialty data warehouse systems still have a huge role to play.
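To make the shared-key point concrete, here is a rough sketch in Python. It is not how Vertica or any other product works; the table layouts, the `customer_id` and `amount` field names, and the modulo placement rule are all assumptions chosen for illustration. It only shows that when two fact tables are partitioned on the same key, each node can combine its own slices without looking at any other node.

```python
from collections import defaultdict


def partition(rows, key, n_nodes):
    """Place each row on a node chosen from its partitioning key (hypothetical rule)."""
    nodes = defaultdict(list)
    for row in rows:
        nodes[row[key] % n_nodes].append(row)
    return nodes


def margin_by_customer(revenue_rows, cost_rows, n_nodes=4):
    """Compute revenue minus cost per customer, one node at a time."""
    rev_parts = partition(revenue_rows, "customer_id", n_nodes)
    cost_parts = partition(cost_rows, "customer_id", n_nodes)
    margins = {}
    for node in range(n_nodes):
        # Both fact tables use the same key and the same placement rule,
        # so every row for a given customer is already on this node.
        local = defaultdict(float)
        for r in rev_parts.get(node, []):
            local[r["customer_id"]] += r["amount"]
        for c in cost_parts.get(node, []):
            local[c["customer_id"]] -= c["amount"]
        # Customers never span nodes, so merging cannot overwrite anything.
        margins.update(local)
    return margins


revenue = [
    {"customer_id": 1, "amount": 100.0},
    {"customer_id": 2, "amount": 50.0},
    {"customer_id": 1, "amount": 25.0},
]
costs = [
    {"customer_id": 1, "amount": 40.0},
    {"customer_id": 5, "amount": 10.0},
]
result = margin_by_customer(revenue, costs)
```

If the two tables were partitioned on different keys, the inner loops would need rows held by other nodes, and that is where the N-table case would stop looking like the 1-table case.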

Feb
18
Posted on 18-02-2008
Filed Under (Readings) by admin on 18-02-2008

I recently caught up with ParAccel’s CTO Barry Zane and Marketing VP Kim Stanick for a long technical discussion, which they have graciously continued by email. It would be impolitic in the extreme to comment on what led up to that. Let’s just note that many things I’ve previously written about ParAccel are now inoperative, and go straight to the highlights.

  • ParAccel sells a columnar, disk-centric data warehouse DBMS. Similar but not identical data structures are used in RAM cache and on disk. If there’s enough RAM, ParAccel’s system runs entirely in memory, except to the extent it obviously doesn’t (e.g., transaction persistence). In its TPC-H benchmarks and in some customer situations, ParAccel has run entirely in memory.
  • ParAccel initially stores updates (whether transactional or bulk load) in cache. At transaction commit time, or when the cache fills, changed blocks are stored on disk. Thus, as in most other DBMS, it is necessary to read a block into memory in the first place before you change it.
  • One ParAccel option is “Amigo” mode, in which the ParAccel database is continually synchronized with a SQL Server database, and queries are dynamically routed to one of the two systems. (There’s no true federation at this time.) Each resynchronization starts with a new SQL Server query, at a scheduled interval. This interval can be as low as 5 seconds or as high as 10-20 minutes. Barry thinks the overhead of the resulting updates is “noise level” if the interval is 30 seconds or higher.
  • Writing a row or reasonably small group of rows in a table with C columns requires C writes to disk, versus the 1 write required in a row-based system. (For a sufficiently large bulk load, of course, that wouldn’t be true. Consider the extreme example in which the whole database is loaded. Then the number of blocks written is the same no matter what architecture you have, except for the differences caused by compression, by any indexes you store on disk, and so on.)
  • While single-record inserts are much slower than in row-based systems, Barry thinks that performance sacrifices are minor if rows are loaded a few thousand at a time or more. (I believe that in this and similar estimates he assumes the number of columns to be no more than a few dozen. While accurate for most applications, that might not be true for users who manipulate 1000+ column credit records.)
  • ParAccel claims strong SQL Server compatibility, including running T-SQL stored procedures (but not other stored procedure languages, Postgres PL/pgSQL excepted). However, while the SQL execution itself is parallel, the rest of the stored procedure executes only on a single “leader” node.
  • Oracle PL/SQL compatibility is a roadmap item.
  • ParAccel supports C/C++ UDFs (User Defined Functions). Scalar UDFs execute in parallel. However, a UDF that invokes SQL runs only on the leader node – except, of course, for the SQL part itself.
  • In Amigo mode, ParAccel of course runs the same schema as the OLTP SQL Server instance it’s synchronizing with. Thus, they in no way make the Vertica assumption that all data warehouses have star or snowflake schemas. Nor do they replicate fact tables between nodes. Barry claims that ParAccel has done a great job on internode transport speeds, but the details are confidential.
  • Even more confidential is support for another claim of Barry’s. Just as columnar systems are slow when writing whole rows, they also are slow when retrieving them. But ParAccel has a deeply-secret way of greatly reducing this penalty.
  • Like Vertica, ParAccel supports limited materialized views, called “projections.” A major use of these is to store columns in multiple sort orders.
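The write-cost point in the list above can be illustrated with a back-of-the-envelope model. This is only a sketch under loud assumptions: every value is the same size, a disk block holds a fixed number of values (`values_per_block`, a made-up parameter), and compression, indexes and caching are ignored. It says nothing about ParAccel’s actual storage format.

```python
def ceil_div(a, b):
    """Integer ceiling division."""
    return -(-a // b)


def row_store_writes(n_rows, n_cols, values_per_block=1000):
    """Blocks written when whole rows are packed one after another."""
    return ceil_div(n_rows * n_cols, values_per_block)


def column_store_writes(n_rows, n_cols, values_per_block=1000):
    """Blocks written when each column is stored in its own blocks."""
    return n_cols * ceil_div(n_rows, values_per_block)


# One row, 30 columns: the column store touches one block per column.
single_row = (row_store_writes(1, 30), column_store_writes(1, 30))

# A few thousand rows at a time: the gap closes in this simple model.
batch = (row_store_writes(3000, 30), column_store_writes(3000, 30))

# One very wide record, as in the 1000+ column credit example.
wide_row = (row_store_writes(1, 1000), column_store_writes(1, 1000))
```

In this toy model a single 30-column row costs 1 block write in the row layout and 30 in the columnar one, while a 3,000-row batch costs the same in both, which matches the shape of the claim even though the real numbers depend on everything the model leaves out.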

Related link: Database management choices – relational data warehouse

Feb
15
Posted on 15-02-2008
Filed Under (Readings) by admin on 15-02-2008

This is the fifth of a five-part series on database management system choices. For the first post in the series, please click here.

Relational database management systems have three essential elements:

  1. Rows and columns. Theoretically, rows and columns may be inessential to the relational model. But in reality, they are built into the design of every real-world relational product. If you don’t have rows and columns, you’re not using the product to do what it was well-designed for.
  2. Predicate logic. Theoretically, everything can be fitted into a predicate Procrustean bed. But if you’re looking for relevancy rankings on a text search, binary logic is a highly convoluted way to get them.
  3. Fixed schemas. Database theorists commonly assume that databases have fixed schemas. If this means that 90%+ of all information is null or missing, they have elegant ways of dealing with that. Even so, as computing gets ever more concerned with individuals — each with his/her/its unique “profile(s)” — fixed schemas get ever harder to maintain.

If any of these three elements is missing or inappropriate, then a traditional relational database management system may not be the best choice.

More and more, it may be the case that the best logical data structure for your application isn’t entirely rows and columns. To be sure, almost every application has some alphanumeric aspects – e.g., the metadata associated with text documents, images, etc. But when you’re dealing with text, multimedia, geographic location, or nontrivial graphs, it’s a good bet that rows and columns don’t describe the most important part of your data.

The discussion of how to handle datatypes that don’t naturally fit into tables is complicated, which is why I spread it over several earlier posts. In essence, there are up to five choices for any particular datatype (standalone server, separate server integrated into RDBMS, wholly integrated into RDBMS, user-defined functions in RDBMS, or entirely outside a DBMS in simple files). Which one is best may vary greatly with your requirements for performance, transaction integrity, or query sophistication.

But datatypes aside, there’s another reason to leave the relational paradigm, and here I’m saying something much more controversial. To wit, I assert:

Even when data is alphanumeric, it may not belong in a rigid schema.

Clearly, there are only three reasonable opinions on the matter:

  1. Absolutely all alphanumeric data belongs in fixed schemas. To say otherwise is fuzzy thinking.
  2. In a few extreme or trivial cases, a fixed schema may miss the point. But in almost all cases that matter, you should use a fixed schema.
  3. In a significant (and growing) minority of cases, fixed schemas are counterproductive.

Here’s why I take Stance #3.

There are basically two kinds of new application — those that rely on old data and databases, and those that manage essentially new information. I’m talking about the latter kind. Those are often the ones that are highest value and most interesting, and certainly are the ones most apt to require new database management systems. Examples of this new data include:

  • User profiles and web surfing behavior.
  • Data gathered via innovative marketing campaigns.
  • Location information collected via various kinds of new devices.

In many of these cases, the information you can obtain varies from one subject to the next, because it’s based on their consent, lifestyle, or use of certain devices. Or it varies from one marketing campaign to the next. Or it varies from one country to the next, due to data privacy laws. Or it varies from one quarter to the next, due to ever-advancing technology. That kind of variability is only going to increase. And as it does, fixed schemas will – at least for some applications – seem increasingly quaint.
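A small Python sketch of what that variability does to a fixed schema. The profile records and their field names are invented for illustration; the function simply measures how many cells would be null if every record were forced into one table with a column for every attribute seen anywhere.

```python
def null_fraction(records):
    """Share of empty cells if all records were stored in one fixed-schema table."""
    columns = set()
    for record in records:
        columns.update(record)  # collect every attribute name seen
    total_cells = len(records) * len(columns)
    if total_cells == 0:
        return 0.0
    filled_cells = sum(len(record) for record in records)
    return (total_cells - filled_cells) / total_cells


# Hypothetical profiles: what is known differs by consent, campaign and device.
profiles = [
    {"id": 1, "country": "IT", "gps_lat": 42.7},
    {"id": 2, "campaign": "spring"},
    {"id": 3, "country": "US", "device": "phone", "opt_in": True},
]

fraction = null_fraction(profiles)
```

With just these three records, half the cells of the combined table would be empty, and each new kind of attribute widens the table for everyone.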

So how does one do database management without fixed schemas? To date, not very well. But XML databases are getting better, text search vendors are getting more serious about providing DBMS-like programmability, and object-oriented DBMS aren’t quite dead yet.

More and more, fluid-schema databases will seem both natural and necessary.

At least, that’s how I think things will play out.

Feb
10
Posted on 10-02-2008
Filed Under (Readings) by admin on 10-02-2008

One of the open secrets of the technology industry is that many — if not most — technology analyst firms are “pay for praise.”

Such analyst firms conduct research designed to flatter their clients who sponsor that research. Those clients then promote those “objective” research results as justification of their innovation leadership, and as proof of their marketing hyperbole.

Often, when technology journalists hear about a company being named to a research firm’s list of leaders, the question we ask ourselves is, “How much did that cost?”

Some analyst firms are more honest than others, of course. And some are much worse. Lee Gomes, in a Jan. 30 story in The Wall Street Journal, wrote about the practice in “Vendors Still Paying For IT Research That Flatters Them.” His story hits the nail on the head.

Lee focused on one notorious flatterer, Aberdeen Group, beginning with,

There were many excesses during the Internet bubble; one involved the Aberdeen Group, which passed itself off as a technology consulting and research operation, but which was for the most part a “pay-for-praise” operation. If you saw an Aberdeen report saying that Acme MicroMacro sold world-class solutions, you could be sure that Acme had written Aberdeen a world-class check.

He continues that under its new owners, Harte-Hanks, Aberdeen has a new business model that discloses the vendor relationship:

The current Aberdeen comes up with a research topic, typically involving some new technology trend, and then approaches tech companies selling products associated with the trend. For what customers say is roughly $30,000 a company can become a report sponsor. Aberdeen, which wouldn’t discuss its fee, then sends questionnaires to tech users, asking about their current activities and future plans for the area in question. The reports are meant to be a snapshot of the marketplace and don’t mention specific companies.

The result, reports Lee:

The potential conflict in this approach, though, is clear. The reports are big business — there were 212 last year — each typically with four or five sponsors. But if much of your top line is dependent on getting tech companies to sponsor your research reports, you’ve got quite an incentive to design questionnaires that will yield the kind of reports tech vendors will want to sponsor.

In that regard, Aberdeen delivers. The reports seem to invariably discover that “best in class” companies use, or are thinking about using, or somehow embody, whatever technology the report happens to be discussing.

While Aberdeen is noteworthy for its participation in such non-objective research, it’s surely not the only one. Think about the biggest, most influential analyst firms in IT. I’m sure you can think of several household names. Nearly all of them play the same game: their reports are meant to flatter their sponsors, not to offer honest advice to the enterprise IT managers who rely on those reports to help make difficult technology decisions.

It’s a shame that when it comes to analysts, you just can’t trust them, most of the time.

Feb
07
Posted on 07-02-2008
Filed Under (Analysts, Open Source) by italovignoli on 07-02-2008

I’m in Napa for the Open Source ThinkTank. It’s a big event for the names attending but it’s a small event for the numbers as there are only 129 people from all over the world (actually, I should say from all over the States, plus a few from Europe).

I want to understand if the concept of marketing for the open source environment that I have developed over the last three years working as a volunteer for OpenOffice.org is a sustainable one. In Italy, the results have been just incredible.

In the next couple of days I will have the opportunity to exchange ideas with some of the most brilliant minds in this industry. I have a couple of meetings set with Andy Astor, CEO of EnterpriseDB, and Marten Mickos, CEO of MySQL, and I am looking forward to meeting, among others, Matt Asay of Alfresco and Raven Zachary of The 451 Group.

Feb
01
Posted on 01-02-2008
Filed Under (Readings) by admin on 01-02-2008

Microsoft claims it was IBM who really screwed up the whole OOXML standardization process. I beg to differ. Ars Aperta did most of the work! And it does not stop there. Ars Aperta is secretly in command of IBM (that’s the High Command Center against Microsoft and Everything American I have been mentioning here from time to time). And Google and all that? Muwarr Harrh Harrrrhhh…. Minions of mine they are, young padawan! So you can imagine how outraged I feel when I read this display of blatant ignorance. It shall be corrected in the future!

In retaliation, Microsoft is bidding for nothing smaller than Yahoo! Good thing the DOJ is “interested” in the case. Remember who’s the monopoly in the story?

Brilliant idea, Novell, just brilliant. You’re on your way to proprietary heaven, folks. Honestly, who’s in charge of your marketing? “Migrating to Vista? We can help. Novell”. I realize you want to push your quite profitable ZenWorks solutions, but please, keep some decency. At least I know what to show Miguel de Icaza the next time he tells me about Free Software.

Enjoy your weekend!
